Remarks 

Regarding amendments in the Specification 

The terms "panel of markers" and "marker panel" have been added to the specification by amendment 
so that there is antecedent basis in the claims. The terms "panel of markers" and "marker panel" are 
well known in the art and have the same usage as the term "set of markers" in paragraph [0018]. 
Applicants believe that this change is more reflective of the usage of these terms in the art and 
increases claim clarity. 

Regarding amendments in the Claims: 

The Examiner rejected claim 5 in the First Office Action dated August 23, 2004. Applicants have 
canceled claim 5. 

Applicants are entitled to a total of 20 claims, including 3 independent claims subsequent to the RCE 
filed in the present application on March 4, 2004. Nineteen new claims, claims 6 through 24 inclusive (of 
which 2 claims are independent claims), have been added to the application and do not require 
additional claim fees. 

Regarding new claim 6, this claim is essentially the same as pending, previously allowed claim 3 in 
related application No. 09/623,068. The only difference is that the words "collection of points" in 
claim 3 have been changed to "collection of one or more points" in claim 6. This change is supported by 
paragraph [0050]. If necessary to obviate a double patenting rejection, applicants will file a letter 
expressly abandoning application No. 09/623,068. In addition, such action would allow the 
concentration of attention and economic resources in the present application. The subject matter of 
application No. 09/623,068 is incorporated by reference into the present application in its entirety, and 
the present application claims priority from application No. 09/623,068. 

The Examiner rejected many of the claims in related application No. 09/623,068 on the basis of failure 
to satisfy the written description requirement. In the event of traversal of the rejections, applicants were 
requested to indicate with particularity where in the specification support for limitations in the claims 
was to be found. (Applicants respectfully submit that at least some such support was cited in the 
Remarks section of the Amendment/Response of June 29, 2004 and in the two supplemental 
responses.) In order to expedite allowance of the presently pending claims in the present application, 
applicants will indicate with particularity support in the specification for limitations in the presently 
pending claims. This support will be cited in terms of paragraph numbers in square brackets. In addition, 
support is cited in the inventor's paper, Annals of Human Genetics, 1998, vol. 62, pp. 159-179, 
abbreviated AHG98 herein. This paper is incorporated by reference into the present patent application 
and a copy of AHG98 is also included herewith for the Examiner's convenience. 

Applicants will cite support in the specification that would be apparent to a person of ordinary skill in the 
art. Regarding the written description requirement, applicants respectfully cite some quotes from the 
MPEP and case law. "[T]he 'essential goal' of the description of the invention requirement is to clearly 
convey the information that an applicant has invented the subject matter which is claimed." In re Barker, 
559 F.2d 588, 592 n.4, 194 USPQ 470, 473 n.4 (CCPA 1977). To satisfy the written description 
requirement, a patent specification must describe the claimed invention in sufficient detail that one 
skilled in the art can reasonably conclude that the inventor had possession of the claimed invention. 
See, e.g., Vas-Cath, Inc. v. Mahurkar, 935 F.2d at 1563, 19 USPQ2d at 1116. 

What is conventional or well known to one of ordinary skill in the art need not be disclosed in detail. See 
Hybritech Inc. v. Monoclonal Antibodies, Inc., 802 F.2d at 1384, 231 USPQ at 94. If a skilled artisan 
would have understood the inventor to be in possession of the claimed invention at the time of filing, 
even if every nuance of the claims is not explicitly described in the specification, then the adequate 
description requirement is met. See, e.g., Vas-Cath, 935 F.2d at 1563, 19 USPQ2d at 1116; Martin v. 
Johnson, 454 F.2d 746, 751, 172 USPQ 391, 395 (CCPA 1972) (stating "the description need not be in 
ipsis verbis [i.e., "in the same words"] to be sufficient"). 

In addition, "Adequate description under the first paragraph of 35 USC 112 does not require literal 
support for the claimed invention... Rather, it is sufficient if the originally-filed disclosure would have 
reasonably conveyed to one having ordinary skill in the art that an appellant had possession of the 
concept of what is claimed." Ex parte Parks, 30 USPQ 2d 1234 (B.P.A.I. 1992). 

Regarding new claim 7, the following support is cited. For the limitation "wherein the CL-F region is for 
a species and a population and the population is a group of individuals as in the field of population 
genetics", e.g., see paragraphs [0175] bottom and [0135] (the term population is used in a statistical 
sense and in the sense in which the term is used in the field of population genetics); see also, e.g., 
[0306], where the term population is used as in the field of population genetics (e.g., Finnish populations). 
For the limitations "wherein the CL-F region is N covered to within [x, y] by the two or more bi-allelic 
covering markers, wherein x is less than or equal to about D_CL or the equivalent thereof and y is less 
than or equal to about 0.2, D_CL is equal to the largest chromosomal length, computed by any method, for 
which linkage disequilibrium has been observed between any polymorphisms in any population of the 
species, N is an integer greater than or equal to 1", e.g., see [0178], [0179], [0180], and [0080]. 

For the limitation in new claim 7: "wherein the choice of covering markers is not based on the 
assumption that a covering marker is the trait-causing polymorphism", e.g., see [0027] mid paragraph 
and [0029] mid to bottom paragraph (i.e., there is increased power even when the disease or trait 
causing allele is not the analyzed allele and the m/p ratio departs from unity and/or linkage disequilibrium 
between the analyzed marker and trait-causing (disease) polymorphism is not maximal). The same 
concepts are found throughout the application; for example, Table 2 on p. 21 of the present application and 
Tables 1, 2 and 3 in AHG98, pp. 165 and 167, all show increased power when the analyzed marker is 
not the trait-causing (or disease) polymorphism. (Disease is a genetic trait; see [0059].) The Risch & 
Merikangas analysis is referred to in the Background of this patent application (e.g., see Risch & 
Merikangas analysis and the Müller-Myhsok and Abel criticism/letter [0027]). In the Risch & Merikangas 
analysis, the TDT was assumed to test the disease locus itself or a perfectly associated bi-allelic 
marker, i.e., association with m = p and δ = δ_max (e.g., see AHG98 pp. 166 bottom, 169 top and 
Background [0027], [0029]; perfectly associated markers are also discussed on pp. 178, 179 of AHG98 
and [0316]). For each TDT association study or association test in the Risch & Merikangas analysis, a 
marker allele with known association to each possible disease-causing polymorphism allele (in each 
such association study or test) is tested; and the known association values or data are m = p and δ = δ_max 
for each such possible disease-causing polymorphism. The inventor's work has extended the analysis 
of Risch & Merikangas from the optimal situation of a perfectly associated marker for each possible 
disease-causing polymorphism (in each association study or test) to include the more common, less 
optimal situations in which m ≠ p and/or δ ≠ δ_max for tested marker-possible disease-causing 
polymorphism pairs (e.g., see [0027], [0029] and final paragraph of AHG98 pp. 170 bottom, 171 top). 

For the limitation in new claim 7: "wherein the group of two or more covering markers is not an 
essentially one-dimensional panel of markers for a linkage study, wherein the essentially 
one-dimensional panel is a panel not based on using similarity of marker allele frequency and possible 
trait-causing polymorphism allele frequency to increase the power of an association-based linkage test to 
detect evidence for linkage", see, e.g., [0019], [0020], top [0035] (i.e., conventional linkage study 
techniques are essentially one dimensional, focus on the dimension of chromosomal location but give 
little attention to the dimension of allele frequency) and see, e.g., [0308]: "It is well known that increased 
disequilibrium between a marker and linked disease locus increases evidence for linkage provided by 
association-based linkage tests such as the TDT. However, what has not been recognized is that the 
specific allele frequencies of the marker locus can also have an enormous impact on the strength of 
evidence for linkage." And see a rendition of the principle that the inventor discovered, e.g., [0285]: 
the power of association-based tests for linkage is increased as the allele frequencies of the 
disease-causing (or trait-causing) allele of a bi-allelic gene (or polymorphism) and a positively associated 
allele of a linked bi-allelic marker become similar in magnitude. That is, conventional (essentially 
one-dimensional) techniques are not based on using similarity of marker allele frequency and possible 
trait-causing polymorphism allele frequency to increase the power of an association-based linkage test to 
detect evidence for linkage. In addition, the application has been amended to include the well-known 
terms "panel of markers" and "marker panel", which have the same usage as the term "set of markers" in 
paragraph [0018]. 

For the record, the applicants note that the linkage disequilibrium in the well known principle quoted 
above from [0308] is essentially measured in a specific way: i.e., the increased disequilibrium is 
computed respectively as δ/δ_max for δ > 0 or δ/δ_min for δ < 0, wherein each of the δ values is a value of 
the coefficient of disequilibrium. This is the way (or essentially the way) that increased linkage 
disequilibrium is computed in the application in paragraphs [0291], [0292], in Table 2 on page 21, and in 
AHG98 in Tables 1, 2, and 3, pp. 165, 167. 

For support for new claim 8, see, e.g., [0075], [0050], [0090]; that is, CL-F regions may be large or 
small and a segment-subrange is a kind of CL-F region. 

For new claim 9, support for the limitations beginning with "wherein the CL-F region is for 
a species and a population" and ending with "not based on using similarity of marker allele frequency 
and possible trait-causing polymorphism allele frequency to increase the power of an association-based 
linkage test to detect evidence for linkage" has been cited above under new claim 7. For the remaining 
limitations in the claim, see, e.g., [0160], i.e., any method of systematically covering a CL-F region is 
acceptable. Such a method is taught in the Set/Subset Example paragraphs [0301] through [0321] 
inclusive of the Theory of Operation/Set/Subset Example [0281]. In this Set/Subset Example, a CL-F 
region is systematically covered using covering markers that are members of sets and subsets. Marker 
set and subset membership is based on the markers being located on particular chromosomal 
segments and having particular least common allele frequencies. Each of the whereby clauses in this 
claim merely states the result of the invention recited in the claim and is not a limitation. As stated by the 
Federal Circuit, "[a] 'whereby' clause that merely states the result of limitations in the claim adds nothing 
to the patentability and substance of the claim." Texas Instruments, Inc. v. U.S. Int'l Trade Commission, 
988 F.2d 1165, 1172, 26 USPQ2d 1018, 1023 (Fed. Cir. 1993). 

For support for new claim 10, for the first limitation, see [0321], which states that Step 3 described in 
[0313] to [0317] is not essential (but it does increase efficiency). See, e.g. also [0271] which states that 
limiting the number of pairs of redundant covering markers is not crucial, but does increase efficiency. 
The application teaches the covering of a rectangular CL-F region using sets and subsets of covering 
markers; see, e.g. the Set/Subset Example paragraphs [0301] through [0321] cited supporting new 
claim 9 above, and see also, e.g., [0283] (marker least common allele frequencies vary systematically 
over a range or subrange and marker chromosomal locations vary systematically over one or more 
chromosomes or chromosomal regions), [0075] (A CL-F region may be large or small and can range 
over an entire chromosome or only a very small segment; and can range over the entire frequency 
range 0 to 0.5 or alternatively over only a very small subrange.); see also, e.g. [0321] which states that 
"versions of the invention are operable and have utility for any subrange of the least common allele 
frequency range 0 to 0.5" and [0185] (a segment-subrange is a rectangular CL-F region). The covered 
rectangular region is bounded by a chromosome or subregion of interest in the chromosomal location 
dimension and by a subrange in the allele frequency dimension. Each of the whereby clauses in this 
claim merely states the result of the invention recited in the claim and is not a limitation. 

Regarding support for new claim 11, human being is a species described in the application. 

Regarding support for new claim 12, see, e.g., [0321], which states that "versions of the invention are 
operable and have utility for any subrange of the least common allele frequency range 0 to 0.5", and see 
[0311], which describes covering the subrange "below 0.1/above 0.9" (i.e., the least common allele 
frequency subrange 0 to less than 0.1); see also, e.g., [0075], which says, "the least common allele 
frequency coordinates of CL-F points in a particular CL-F region can range over only a very small 
subrange" and gives an example of a subrange (the subrange 0.1 to 0.2) of small width (0.1). This 
width, 0.1, is the same as the width of the subrange 0 to less than 0.1. 

Regarding support for new claim 13, "N ≥ 2" in claim 10 (from which claim 13 depends) literally 
means that N is greater than or equal to 2. As stated in [0182], "In general, the greater N is, the greater 
the power of a version of the invention for linkage studies". Thus the specification supports higher 
values of N, i.e., N > 2. 

Regarding support for new claim 14, see, e.g. [0324], which recites "thousands of bi-allelic markers". 
It was well-known in the art that large numbers of bi-allelic markers (e.g., thousands) would be available 
for use in linkage studies. For example, in the analysis of Risch and Merikangas [0027], 500,000 bi- 
allelic markers are studied. And the inventor's paper is a generalization of the Risch and Merikangas 
analysis [0029] (i.e. Risch and Merikangas is a special case of the inventor's general framework), see 
also e.g., [0249] which recites thousands of markers. 

Regarding support for new claim 15, see, e.g. [0169], [0170], [0171]. See also, e.g. [0285] through 
[0289] inclusive and [0296] which describe increases in power along the allele frequency dimension, i.e. 
one or more gradients in power along the allele frequency dimension. 

Regarding new claim 16, this claim is essentially the same as pending, previously allowed claim 78 in 
related application No. 09/623,068. As stated above, if necessary to obviate a double patenting 
rejection, applicants will file a letter expressly abandoning application No. 09/623,068. One difference 
is that the words "collection of points" in claim 78 have been changed to "collection of one or more 
points" in claim 16. This change is supported by paragraph [0050]. 

The only other difference between claim 16 in the present application and claim 78 in the '068 
application is that the words "substantially complementary" in claim 78 have been changed to simply 
"complementary" in claim 16. This brings claim 16 into verbatim conformity with the specification at, for 
example, [0265]. Given the definition of an oligonucleotide that is "complementary" [0141], the term 
"complementary" in this definition is essentially the same as "substantially complementary". The 
applicants respectfully submit that the word "substantially" in "substantially complementary" of claim 78 
is therefore redundant, and that the single word "complementary" means the same thing as 
"substantially complementary" in the context of claim 78. Accordingly, the change from "substantially 
complementary" in claim 78 to "complementary" in claim 16 is a formality and does not change claim scope. 

Regarding new claims 17 to 24, the limitations in these claims are very similar or essentially the same 
as those in new claims 7 to 15. Applicants have previously cited support in the specification for these 
limitations and the Examiner is respectfully referred to these comments above. 

Conclusion 

Single pending claim 5, rejected in the first Office Action post RCE, has been canceled. Nineteen new 
claims, claims 6 through 24 inclusive (of which 2 claims are independent claims), have been added to 
the application and do not require any additional claim fees. An appropriate payment of the fee for a one 
month extension is enclosed. The two newly added independent claims, 6 and 16, are essentially the 
same as previously allowed claims in related application No. 09/623,068. If necessary to obviate a 
double patenting rejection, applicants will file a letter expressly abandoning application No. 09/623,068. 
Applicants have also cited parts of the present specification which support the new claims. (These 
citations of support are not necessarily exhaustive for the pending claims.) 

For the reasons advanced above, applicants respectfully submit that the application is now in condition 
for allowance and that action is earnestly solicited. 

Respectfully submitted, 



Robert O. McGinnis 
Registration No. 44,232 

Dec. 23, 2004 
1575 West Kagy Blvd. 
Bozeman, MT 59715 
tel (406)-522-9355 

Hidden linkage: a comparison of the affected sib pair (ASP) test and 
transmission/disequilibrium test (TDT) 

(Received 3.11.97. Accepted 16.3.98) 
SUMMARY 

I compare the transmission/disequilibrium test (TDT) and affected sib pair (ASP) test under a 
general algebraic model describing a bi-allelic disease locus. Assuming linkage to a bi-allelic marker, 
I derive two binomial probabilities, one for parental allele 'transmission' (P_t) which determines the 
magnitude of the TDT χ² statistic (χ²_tdt), and a second for identity-by-descent (ibd) marker allele 
'sharing' (P_s) which determines the magnitude of the ASP test statistic (χ²_asp). I also consider the ASP 
test applied to a completely polymorphic marker and demonstrate that the probability of ASP 
marker allele sharing (P_s) is identical to P_s observed for a bi-allelic marker in equilibrium with the 
disease locus. I present a general framework for determining the power of the TDT and ASP test 
based on expressions for P_t, P_s and the proportion (H/F) of ascertained parents who are informative 
at the marker. Two previous analytic investigations of TDT power based on the work of Ott (1989), 
and Risch & Merikangas (1996) are shown to be special cases of this general framework. In addition, 
I show the relationship between the framework I present and a third analytic investigation of TDT 
power for multi-allelic markers based on the work of Sham & Curtis (1995). 

INTRODUCTION 

Linkage has been demonstrated between insulin-dependent diabetes mellitus (IDDM) and the 
insulin gene region on chromosome 11p15.5 on the basis of linkage analysis by the 
transmission/disequilibrium test or TDT (McGinnis et al. 1991; Spielman et al. 1993). Linkage was demonstrated 
at the insulin 5'VNTR, a hypervariable marker that is extremely polymorphic, but whose VNTR 
alleles fall into two main size classes in Caucasians, thus forming a natural bi-allelic (+/−) marker. 
The + alleles were discovered to be positively associated with IDDM in case-control studies (Bell 
et al. 1984). Subsequent studies then demonstrated linkage in families collected for Genetic Analysis 
Workshop 5 (GAW5) by TDT analysis of GAW5 parents who were heterozygous (+/−) under the 
5'VNTR bi-allelic categories (Spielman et al. 1993; see also Thomson et al. 1989, Julier et al. 1991). 

The very strong evidence for linkage provided by the TDT (χ² = 8.26, p < 0.005) was both 
surprising and puzzling because identity-by-descent (ibd) sharing of 5'VNTR alleles in affected sib 
pairs (ASPs) yielded no evidence for linkage in the same GAW5 families. Indeed, evidence for 
linkage was completely undetected or 'hidden' because the proportion of alleles shared by ASPs 
did not exceed the null hypothesis value of 0.5 in two different types of ASP analysis. On one 
hand, there was no increase in ASP allele sharing when the analysis included all GAW5 families in 
which both parents were informative for any two lengths of 5'VNTR allele (Spielman et al. 1989; 
Cox & Spielman, 1989). On the other hand, when the analysis included only those ASP parents 
who were evaluated by the TDT, namely those heterozygous (+/−) when the 5'VNTR is 
considered bi-allelic, there was again no evidence for linkage - in fact, fewer ASPs were concordant for 
a parental allele than were discordant (see table 7 in Spielman et al. 1993). Thus, whether the 
5'VNTR was evaluated as highly polymorphic or as a bi-allelic marker associated with disease, 
ASP analysis failed to detect the strong evidence for linkage provided by the TDT. 

Address for correspondence: Dr. Ralph McGinnis, Senior Investigator, SmithKline Beecham, New Frontiers 
Science Park (North), Third Avenue, Harlow, Essex CM19 5AW. 

This striking divergence in ASP and TDT linkage results illustrates that more information is 
needed about the relative power of the two tests under various conditions. Here I provide such 
information by analytically comparing the power of the TDT and ASP test as a function of 
variation in standard genetic parameters such as recombination fraction, disequilibrium, penetrance, 
and disease allele frequency. Based on a general algebraic model of linkage between a bi-allelic 
marker and bi-allelic disease locus, I derive two binomial probabilities. One probability for parental 
allele 'transmission' (P_t) determines the magnitude of the TDT χ² statistic (χ²_tdt), and a second 
probability for ibd marker allele 'sharing' (P_s) determines the magnitude of the ASP test statistic 
(χ²_asp). I also consider the ASP test applied to a completely polymorphic marker linked to a 
bi-allelic disease locus. In this situation, the probability of ASP marker allele sharing (P_s) is 
demonstrated to be identical to P_s observed for a bi-allelic marker in equilibrium with the disease locus. 

The major findings of my investigation are as follows: 

(1) The TDT, but not ASP test, can detect linkage with high power, when homozygotes for a 
susceptibility allele have only 2 to 4-fold greater disease risk than homozygotes for the normal or 
wild-type allele. 

(2) TDT power is increased by disequilibrium between a bi-allelic marker and disease locus, and 
is also markedly increased when the disease allele and positively associated marker allele have 
similar population frequencies. 

(3) The algebraic expressions for P_t and P_s each contain a product of three factors whose 
magnitude is added to the null hypothesis value of 0.5: P_t = 0.5 + L_t M_t R_t and P_s = 0.5 + L_s M_s R_s (see 
Results for the definition of each factor). The similarity of the corresponding factors in each 
expression facilitates ASP and TDT comparisons, and the different factors in each product enable 
some 'partitioning' of the contribution to evidence for linkage provided by different genetic 
parameters such as disequilibrium. 

Based on the expressions for P_t, P_s and the proportion (H/F) of ascertained parents who are 
informative at the marker, I present a general framework for determining the power of the TDT 
and ASP test. Two previous analytic investigations of TDT power based on the work of Ott (1989), 
and Risch & Merikangas (1996) are shown to be special cases of this general framework. I also 
show the relationship between the framework I present and a third analytic investigation of TDT 
power for multi-allelic markers based on the work of Sham & Curtis (1995). 

P_t and P_s determine the magnitudes of χ²_tdt and χ²_asp 

Blackwelder & Elston (1985) investigated power to detect linkage for several varieties of ASP 
test. Among these tests, the t_2 'mean' test was found to be the most powerful ASP test for most 
genetic parameter values. Therefore, I chose the t_2 test (henceforth referred to simply as the 'ASP 
test') to compare with the power of the TDT. For nuclear families with at least two affected sibs, 
the ASP test considers each parent separately and determines whether the parent transmitted the 
same marker allele (ibd) to both sibs of an ASP. The well known χ² statistic for testing for linkage 
by the ASP test is: 

$$\chi^2_{asp} = \frac{(n_s - n_u)^2}{n_s + n_u} = \frac{(n_s - n_u)^2}{n_{asp}}$$

where n_s and n_u are the number of instances in the data set in which a parental allele inherited by 
one affected sib is shared (n_s) or 'unshared' (n_u) by the second affected sib; thus n_s + n_u = n_asp is 
the sample size for χ²_asp and equals the number of trios in the data set consisting of an informative 
parent and an ASP. 

Unlike the ASP test which usually considers ASP allele sharing from parents informative for 
any two marker alleles, the TDT only considers parents heterozygous for two particular marker 
alleles (e.g. A/B only). For a set of nuclear families, the TDT counts the number of times each 
A/B parent transmitted allele A or B to individual affected offspring. As shown by Spielman et al. 
(1993), the χ² statistic for detecting linkage by the TDT is: 

$$\chi^2_{tdt} = \frac{(n_a - n_b)^2}{n_a + n_b} = \frac{(n_a - n_b)^2}{n_{tdt}}$$

where n_a and n_b are the number of instances in which an A/B parent transmitted allele A or B, 
respectively, to an individual affected offspring; and thus n_a + n_b = n_tdt is the sample size for χ²_tdt. 

Note that the algebraic expressions for χ²_asp and χ²_tdt are identical in form. In each χ², the 
denominator is the sample size of the data set. Thus, when sample size (n_asp or n_tdt) is fixed, the 
denominator is constant and the magnitude of each χ² is determined only by the size of the squared 
difference in the numerator [(n_s − n_u)² or (n_a − n_b)²]. 
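
As a minimal sketch of the two statistics just described, the following Python computes each χ² from its pair of counts. The function names and the example counts are illustrative assumptions and are not taken from the paper or from the GAW5 data.

```python
def chi2_asp(n_shared, n_unshared):
    """ASP 'mean' test statistic: (n_s - n_u)^2 / (n_s + n_u)."""
    n_asp = n_shared + n_unshared  # number of informative parent/ASP trios
    if n_asp == 0:
        raise ValueError("need at least one informative parent/ASP trio")
    return (n_shared - n_unshared) ** 2 / n_asp


def chi2_tdt(n_a, n_b):
    """TDT statistic: (n_a - n_b)^2 / (n_a + n_b)."""
    n_tdt = n_a + n_b  # number of transmissions from A/B heterozygous parents
    if n_tdt == 0:
        raise ValueError("need at least one transmission from an A/B parent")
    return (n_a - n_b) ** 2 / n_tdt


# Hypothetical counts, chosen only to show the arithmetic.
print(chi2_asp(60, 40))  # (60 - 40)^2 / 100
print(chi2_tdt(70, 30))  # (70 - 30)^2 / 100
```

Both functions have the same form, which mirrors the observation that only the squared difference in the numerator matters once the sample size is fixed.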

A key idea in this paper is that the magnitude of the numerator in each χ² is determined by a 
specific binomial probability. In the case of χ²_asp, this is the probability of ASP 'allele sharing' or 
P_s, i.e. the probability that a randomly ascertained parent of an ASP transmitted the same marker 
allele (ibd) to both affected sibs. In the absence of linkage, P_s = 0.5. But when linkage is present 
P_s > 0.5, and the larger the value of (P_s − 0.5), the more ASPs that exhibit allele sharing (n_s) and 
the higher the magnitude of χ²_asp. Similarly, a second binomial probability denoted P_t (for 
probability of 'allele transmission') determines the size of χ²_tdt. P_t is the probability that marker allele A 
was transmitted to a specific affected child by a randomly ascertained A/B parent of an ASP. When 
linkage and disequilibrium are present, P_t ≠ 0.5 and the larger the value of |P_t − 0.5|, the greater 
the value of χ²_tdt. 
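
To make the link between a binomial probability and the size of the statistic concrete, here is a small sketch under one stated assumption: the n informative observations are treated as independent binomial trials with success probability P. Under that assumption, the standard binomial mean and variance give E[χ²] = n(2P − 1)² + 4P(1 − P). This identity and the helper names below are illustrative and are not quoted from the paper.

```python
import random


def expected_chi2(prob, n):
    """Expected value of (2k - n)^2 / n when k ~ Binomial(n, prob)."""
    return n * (2 * prob - 1) ** 2 + 4 * prob * (1 - prob)


def simulated_chi2(prob, n, reps=5000, seed=0):
    """Monte Carlo estimate of the same expectation (independent trials assumed)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        k = sum(rng.random() < prob for _ in range(n))  # e.g. shared or A transmissions
        total += (2 * k - n) ** 2 / n
    return total / reps


# With n = 100: P = 0.5 gives an expectation of 1, while P = 0.6 gives about 4.96.
for prob in (0.5, 0.6):
    print(prob, expected_chi2(prob, 100), simulated_chi2(prob, 100))
```

The expectation depends on P only through its distance from 0.5, so P = 0.4 and P = 0.6 give the same value, in line with the use of |P_t − 0.5| in the text.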

General algebraic model of linkage 

At the beginning of Results, I give expressions for P_s and P_t based on the following general model: 
A bi-allelic marker with alleles A and B is linked to a bi-allelic disease locus with 
disease-predisposing allele D and non-predisposing allele d. The model allows any penetrance for the D/D, D/d 
and d/d genotypes (α, β, and γ, respectively) such that 1 ≥ α ≥ 0, 1 ≥ β ≥ 0 and 1 ≥ γ ≥ 0, and 
also assumes that no other locus underlies disease susceptibility. The recombination fraction (θ) 
between marker and disease locus is variable as are the population frequencies of the four 
marker-disease locus haplotypes [f(AD) = c_1, f(Ad) = c_2, f(BD) = c_3, f(Bd) = c_4, where c_1 + c_2 + c_3 + c_4 = 1]. 

Note that once the haplotype frequencies are specified, the population frequency (p) of disease 
allele D is known (p = c_1 + c_3), as are the frequencies (m, 1 − m) of marker alleles A and B, 
respectively (m = c_1 + c_2; 1 − m = c_3 + c_4). Furthermore, the coefficient of disequilibrium (δ) equals 
c_1c_4 − c_2c_3 and thus, when convenient, the haplotype frequencies can be expressed as c_1 = mp + δ, 
c_2 = m(1 − p) − δ, c_3 = (1 − m)p − δ, and c_4 = (1 − m)(1 − p) + δ. 
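
The parameterization above can be checked numerically. The sketch below builds the four haplotype frequencies from m, p and δ and recovers p, m and δ from them. The `delta_bounds` helper is an added assumption: it simply takes the range of δ that keeps all four frequencies non-negative, which follows from the four expressions above. The numeric values are illustrative only.

```python
def haplotype_freqs(m, p, delta):
    """Return (c1, c2, c3, c4) = (f(AD), f(Ad), f(BD), f(Bd))."""
    c1 = m * p + delta
    c2 = m * (1 - p) - delta
    c3 = (1 - m) * p - delta
    c4 = (1 - m) * (1 - p) + delta
    return c1, c2, c3, c4


def delta_bounds(m, p):
    """Range of delta for which every haplotype frequency is >= 0."""
    delta_min = -min(m * p, (1 - m) * (1 - p))  # c1 >= 0 and c4 >= 0
    delta_max = min(m * (1 - p), (1 - m) * p)   # c2 >= 0 and c3 >= 0
    return delta_min, delta_max


m, p, delta = 0.3, 0.2, 0.05  # illustrative values, not from the paper
c1, c2, c3, c4 = haplotype_freqs(m, p, delta)
print("sum of haplotype frequencies:", c1 + c2 + c3 + c4)
print("p recovered as c1 + c3:", c1 + c3)
print("m recovered as c1 + c2:", c1 + c2)
print("delta recovered as c1*c4 - c2*c3:", c1 * c4 - c2 * c3)
print("allowed delta range:", delta_bounds(m, p))
```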

RESULTS 

Based on derivations in Appendix I, equations (1) and (2) show expressions for P_s and P_t in terms 
of standard genetic variables for the general bi-allelic model described above. Both expressions 
assume that parents are ascertained through a randomly selected ASP, and each expression applies 
to an ascertained parent who is also heterozygous A/B at a bi-allelic marker. P_s is the probability 
that the parent transmitted the same marker allele to both affected sibs. P_t is the probability that 
the parent transmitted allele A to a particular affected child. 

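P_s = 0.5 + (1 − 2θ)² · [(c_1c_4 + c_2c_3)/H] · R_s        (Equation 1) 

P_t = 0.5 + (1 − 2θ) · [δ/H] · R_t        (Equation 2) 

[The explicit forms of the rightmost factors R_s and R_t, which depend on the penetrances α, β, γ and on p, are not legible in this copy.] 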

Note that the expressions for P_s and P_t are similar in form. When there is no linkage, both 
expressions equal 0.5; but when linkage is present an amount is added to 0.5 which, in each 
expression, is a product of three factors. In both expressions, the leftmost factor depends on the 
recombination fraction (θ), the middle factor on haplotype frequencies (c_1, c_2, c_3, c_4) and the 
quantity H (see Appendix I), and the rightmost factor on penetrances (α, β, γ) and the frequency (p) of 
disease allele D. Because they play analogous roles in each expression, I denote the leftmost factor 
in P_s and P_t as L_s and L_t, respectively; and similarly denote the rightmost factor as R_s and R_t, and 
middle factor as M_s and M_t. Thus, P_s = 0.5 + L_s M_s R_s while P_t = 0.5 + L_t M_t R_t. 

Why |P_t − 0.5| > (P_s − 0.5) when disequilibrium is extreme 

As described in the Introduction, the ASP approach failed to detect linkage at the insulin 
5'VNTR because the proportion of marker allele sharing in ASPs was close to 0.5, i.e. P_s ≈ 0.5 
whether the 5'VNTR was treated as bi-allelic or as highly polymorphic. By contrast, the TDT was 
able to detect linkage in the same families because P_t ≈ 0.60 (see Spielman et al. 1993). Thus, 
regardless of differences in relative sample size (n_asp, n_tdt) for χ²_asp and χ²_tdt, the relative magnitudes of 
(P_s − 0.5) and |P_t − 0.5| are often the critical factor that causes a substantial difference in power for 
χ²_asp and χ²_tdt. 

It is therefore interesting that analysis of equations (1) and (2) (see below) shows that when 
disequilibrium (δ) reaches its most positive value (δ_max) or its most negative value (δ_min), the 
magnitudes of P_s and P_t are such that: (a) |P_t − 0.5| and (P_s − 0.5) are both maximized and (b) 
|P_t − 0.5| > (P_s − 0.5). This dependence on δ of |P_t − 0.5| and (P_s − 0.5) also has other important 
implications since the value of P_s for a completely polymorphic marker is identical to the P_s value of a 
bi-allelic marker in equilibrium (δ = 0) with a bi-allelic disease locus (see Appendix IV). Therefore, 
if ASP allele sharing for a completely polymorphic marker is denoted by (P_s − 0.5)_∞, then P_t and 
P_s for any bi-allelic marker in extreme disequilibrium with the bi-allelic disease locus are such that: 
|P_t − 0.5| > (P_s − 0.5) > (P_s − 0.5)_∞. 

To understand the pivotal role of δ in maximizing |P_t - 0.5| and (P_s - 0.5), and in determining
their relative magnitudes, consider the three corresponding factors in P_t and P_s. By inspection,
L_t = (1 - 2θ) ≥ L_s = (1 - 2θ)², and when disequilibrium is present, θ should be near 0 and hence
L_t ≈ L_s ≈ 1, the maximum value of each factor. Furthermore, as shown below, R_t is substantially
greater than R_s, and the difference between the two factors is independent of δ and θ since R_t and R_s
depend only on the properties of the disease locus (p, α, β, γ). Therefore, L_t ≥ L_s and R_t > R_s, so it
follows that |P_t - 0.5| would always exceed (P_s - 0.5) were it not for the influence of the remaining
two factors in equations (1) and (2) (M_t and M_s).

Note, then, that M_t and M_s have the same denominator (H as defined in Appendix I); but the
numerator of M_t is δ = c_1c_4 - c_2c_3, while the numerator of M_s is the two components of δ added
together (c_1c_4 + c_2c_3), implying that M_s ≥ |M_t|. Since |M_t| reaches its minimum value of 0 at
equilibrium (δ = 0), while M_s is always positive, it follows that (P_s - 0.5) > |P_t - 0.5| ≈ 0 in an interval
of δ values around δ = 0. However, in Appendix III, I assume that marker allele frequency (m)
and disease allele frequency (p) are fixed, and then show that |M_t| = M_s at δ = δ_max and at δ = δ_min.
I also show that |M_t| and M_s are both maximized at one of the two extreme δ values (δ_max or δ_min).
Therefore, for any bi-allelic marker (i.e. any m and p), when δ equals δ_max or δ_min, |P_t - 0.5| and
(P_s - 0.5) are maximized since |M_t| and M_s are maximized; furthermore, because |M_t| = M_s and
L_t ≈ L_s, the greater magnitude of R_t compared to R_s drives |P_t - 0.5| higher than (P_s - 0.5).

Why R_t > R_s

To understand why R_t > R_s, note first that both factors contain three components that are
multiplied by the coefficients p²/4, [2p(1-p)]/16 and (1-p)²/4, respectively. In R_s, each component
has the form (U - V)² while the corresponding component in R_t is (U² - V²), where U and V (for the
components multiplied by p²/4, [2p(1-p)]/16 and (1-p)²/4) are, respectively, U = α, α+β, β and
V = β, β+γ, γ. Under the assumption that D/D penetrance exceeds d/d penetrance (α > γ) and
that D/d penetrance (β) lies somewhere between (α ≥ β ≥ γ), each component in R_t [U² - V² =
(U+V)(U-V)] must exceed its counterpart in R_s [(U-V)²] since (U+V) > (U-V). The only
exceptions occur when mode of inheritance is dominant (α = β) or recessive (β = γ), in which case
one pair of analogous components in R_t and R_s are equal (both are zero); however, the other two
components in R_t still exceed their counterparts in R_s, and thus R_t > R_s.

To assess how the elevation of R_t above R_s is influenced by the degree of risk conferred by the
disease locus, the risk can be quantified by considering the penetrance of the D/D homozygote
(α) to be r times greater than the penetrance of the d/d homozygote (γ). Thus, α = rγ and the
penetrance of D/d (β) can be considered to fall between α and γ by letting β = γ + x(α - γ) =
γ + x(r - 1)γ where x is a number between 0 and 1. Based on this parameterization, the α:γ
penetrance ratio (r) can be evaluated for its influence on R_t and R_s by dividing each component in
R_t by its counterpart in R_s, which yields the ratios:

$$\frac{1}{1-x}\left[1 + x + \frac{2}{r-1}\right], \qquad 2\left[\frac{1}{2} + x + \frac{2}{r-1}\right], \qquad 1 + \frac{1}{x}\cdot\frac{2}{r-1}.$$

Note that r appears in each ratio only in the term 2/(r - 1), implying that R_t/R_s increases
monotonically as r decreases. Thus, the elevation of R_t above R_s is most extreme for susceptibility loci
causing a modest increase in disease risk as indicated by low values of r.
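The three component ratios can be checked against the direct quotients (U + V)/(U - V). The sketch below is a hypothetical helper, not part of the original analysis; it assumes r > 1 and 0 < x < 1 so that no denominator vanishes.

```python
def closed_form_ratios(r, x):
    """Component ratios of R_t to R_s written in terms of 2/(r - 1)."""
    t = 2.0 / (r - 1.0)
    return ((1 + x + t) / (1 - x), 2 * (0.5 + x + t), 1 + t / x)


def direct_ratios(r, x, gamma=1.0):
    """The same ratios computed as (U + V)/(U - V) from the penetrances."""
    a = r * gamma
    b = gamma + x * (r - 1) * gamma
    g = gamma
    return ((a + b) / (a - b), (a + 2 * b + g) / (a - g), (b + g) / (b - g))


for r in (2.0, 4.0, 10.0):
    print(r, [round(v, 3) for v in closed_form_ratios(r, x=0.5)])
```

Each ratio shrinks toward a finite limit as r grows, which is the monotonic behaviour described above.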

In Tables 1 and 2 below, I show values of P_s and P_t when r = 2 and r = 4, respectively. In these
tables, (P_s - 0.5) ≈ 0, indicating that linkage would be difficult to detect by the ASP approach; but
|P_t - 0.5| is much greater than (P_s - 0.5) when disequilibrium is extreme, thus illustrating that at
low r values, R_t drives |P_t - 0.5| to levels that provide strong evidence for linkage.

Power of χ²_asp and χ²_tdt

I now show how to calculate and compare the power of χ²_asp and χ²_tdt when (a) both tests are
applied to a bi-allelic marker or (b) the TDT is applied to a bi-allelic marker but the ASP test
considers a completely polymorphic marker. I assume the two tests evaluate a series of S randomly
ascertained parents of one or more ASPs, and that each test considers one ASP per parent. In the
Discussion, I explain how to calculate the proportion (H/F) of the S parents who are informative
at a bi-allelic marker. (The quantity F is proportional to the population frequency of parents who
have two or more affected children, and H is proportional to the frequency of such parents who
are also heterozygous at the marker.) Thus H/F determines the sample size for χ²_asp and χ²_tdt (see
Proportion of A/B parents in ascertained families). For instance, if both tests evaluate the same
bi-allelic marker, then sample size for χ²_asp is n_asp = (H/F)S, while sample size for χ²_tdt is twice as
large (n_tdt = 2(H/F)S) since χ²_asp counts pairs of transmitted alleles while χ²_tdt counts individual
alleles. Based on these sample sizes (n_asp, n_tdt) and the values of P_s and P_t, the power of χ²_asp and
χ²_tdt are determined from the binomial distributions

$$\frac{n_{asp}!}{n_s!\,n_u!}\,P_s^{n_s}(1-P_s)^{n_u} \quad\text{and}\quad \frac{n_{tdt}!}{n_a!\,n_b!}\,P_t^{n_a}(1-P_t)^{n_b},$$

respectively, as explained below.

Similarly, the power of χ²_asp when applied to a completely polymorphic marker can also be
determined from the appropriate binomial distribution, but n_asp = S since all parents are informative,
and P_s = 0.5 + (1 - 2θ)²[p(1-p)/F]R_s as shown in Appendix IV. Interestingly, this expression for
P_s when the marker is completely polymorphic is identical to P_s for a bi-allelic marker in
equilibrium with a bi-allelic disease locus. This can be verified by setting δ = 0 in c_1 = mp + δ, c_2 =
m(1-p) - δ, c_3 = (1-m)p - δ, c_4 = (1-m)(1-p) + δ and substituting for the four haplotype
frequencies in the expression for H (see Appendix I) and in equation (1).
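This identity can also be checked numerically. The sketch below is a hedged illustration with hypothetical function names; it assumes that F is the sum over spouse-pair genotypes of the squared per-child risks, and that H is the sum of the Table A2 'weighted' frequencies with k = 2 (both are my reading of the definitions, not expressions quoted verbatim).

```python
def mix(p, x, y, z):
    """Weight squared child risks by spouse genotype D/D, D/d, d/d."""
    return p * p * x * x + 2 * p * (1 - p) * y * y + (1 - p) ** 2 * z * z


def r_s(p, a, b, g):
    """Rightmost factor of P_s."""
    return (p * p / 4 * (a - b) ** 2 + 2 * p * (1 - p) / 16 * (a - g) ** 2
            + (1 - p) ** 2 / 4 * (b - g) ** 2)


def ps_polymorphic(p, theta, a, b, g):
    """P_s for a completely polymorphic marker: 0.5 + (1-2θ)² [p(1-p)/F] R_s."""
    A, B, C = (a + b) / 2, (a + 2 * b + g) / 4, (b + g) / 2
    F = (p * p * mix(p, a, A, b) + 2 * p * (1 - p) * mix(p, A, B, C)
         + (1 - p) ** 2 * mix(p, b, C, g))
    return 0.5 + (1 - 2 * theta) ** 2 * p * (1 - p) / F * r_s(p, a, b, g)


def ps_biallelic_equilibrium(m, p, theta, a, b, g):
    """P_s from the three-factor form with δ = 0 in the haplotype frequencies."""
    c1, c2, c3, c4 = m * p, m * (1 - p), (1 - m) * p, (1 - m) * (1 - p)
    A, B, C = (a + b) / 2, (a + 2 * b + g) / 4, (b + g) / 2
    H = (2 * (c1 * c4 + c2 * c3) * mix(p, A, B, C)
         + 2 * c1 * c3 * mix(p, a, A, b) + 2 * c2 * c4 * mix(p, b, C, g))
    return 0.5 + (1 - 2 * theta) ** 2 * (c1 * c4 + c2 * c3) / H * r_s(p, a, b, g)


for m in (0.1, 0.5, 0.9):
    gap = ps_polymorphic(0.3, 0.0, 4.0, 2.5, 1.0) - ps_biallelic_equilibrium(
        m, 0.3, 0.0, 4.0, 2.5, 1.0)
    print(m, abs(gap) < 1e-12)
```

Under these assumptions H equals 2m(1-m)F at δ = 0, so the marker allele frequency cancels and the two expressions coincide for every m.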

Based on sample size (n_asp, n_tdt) and binomial probability (P_s, P_t), two binomial distributions are
generated which can be used to calculate the power of χ²_asp and χ²_tdt as described in Appendix II.
Specifically, the power of χ²_asp, or the probability that χ²_asp > L (a significance cutpoint), is equal to
the portion of the binomial distribution

$$\frac{n_{asp}!}{n_s!\,n_u!}\,P_s^{n_s}(1-P_s)^{n_u} \quad\text{for which}\quad n_s > \frac{n_{asp}}{2} + \frac{\sqrt{n_{asp}L}}{2}.$$

Similarly, if marker allele A is associated with disease, the power of χ²_tdt is estimated by the portion
of the binomial distribution

$$\frac{n_{tdt}!}{n_a!\,n_b!}\,P_t^{n_a}(1-P_t)^{n_b} \quad\text{for which}\quad n_a > \frac{n_{tdt}}{2} + \frac{\sqrt{n_{tdt}L}}{2}.$$

Thus, standard tables giving the normal approximation to the binomial distribution (Pearson &
Hartley, 1954; Weir, 1996) provide precise power values for virtually any sample size (n_asp, n_tdt),
binomial probability (P_s, P_t), and significance level.
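The cutpoint rule can be sketched in Python as follows. This is a minimal illustration, not the procedure of Appendix II: it evaluates only the upper tail (counts above n/2 + √(nL)/2), uses a normal approximation without continuity correction, and adds an exact binomial sum for comparison. The example values n = 346 and probability 0.581 are assumptions chosen for illustration.

```python
from math import comb, erf, sqrt


def cut_count(n, L):
    """Smallest real-valued count for which the chi-square exceeds L."""
    return n / 2 + sqrt(n * L) / 2


def power_normal(n, prob, L):
    """Upper-tail power by normal approximation to Binomial(n, prob)."""
    sd = sqrt(n * prob * (1 - prob))
    z = (cut_count(n, L) - n * prob) / sd
    return 0.5 * (1 - erf(z / sqrt(2)))  # equals 1 - Phi(z)


def power_exact(n, prob, L):
    """Upper-tail power by summing the binomial distribution (integer n)."""
    cut = cut_count(n, L)
    return sum(comb(n, k) * prob ** k * (1 - prob) ** (n - k)
               for k in range(n + 1) if k > cut)


print(round(power_normal(346, 0.581, 3.84), 2),
      round(power_exact(346, 0.581, 3.84), 2))
```

For a two-tailed test the lower tail contributes a negligible amount when the binomial probability is well above 0.5, which is why only the upper tail is summed here.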

Comparison of TDT and ASP power 

Here I illustrate how the equations for P_t, P_s and H/F can be used to compare the power of χ²_tdt
and χ²_asp. I assume the two tests consider markers that are tightly linked (θ = 0) to bi-allelic disease
loci with additive mode of inheritance (β = (α + γ)/2) and for which the α:γ penetrance ratio is
r = 2, r = 4 or r = 10. Penetrance ratios of r = 2, 4 and 10 were chosen as being somewhat
representative of the entire genetic parameter space since I have found that P_t and P_s increase rapidly
as r increases from 2 to 6 with smaller, asymptotic increases in P_t and P_s for r > 10. Furthermore,
additive mode of inheritance may also be regarded as being somewhat representative since results
from other modes of inheritance do not, in general, substantially differ from results presented here.
In the tables below, I compare χ²_tdt and χ²_asp when both tests consider the same bi-allelic marker, or
when χ²_asp considers a fully informative marker and χ²_tdt evaluates a nearby bi-allelic marker. Such
single test comparisons would be occasioned by: (a) TDT and ASP analysis of a marker that gave
'suggestive' evidence of linkage and disease-association in other families or in comparisons of allele
frequencies in cases and unrelated controls; or (b) TDT and ASP analysis of markers near a
candidate gene suspected of increasing disease susceptibility.

In Table 1 (r = 2), Table 2 (r = 4) and Table 3 (r = 10), column 1 shows disease allele frequency
(p) for a bi-allelic disease locus, and columns 2, 3 and 4 show results for χ²_asp applied to a fully
informative marker. The remaining columns in each table list χ²_tdt and χ²_asp results for a linked
bi-allelic marker whose allele frequency (m) is listed in column 5. Results are given for each value of
m (0.75, 0.50, 0.25) assuming positive disequilibrium between the bi-allelic marker allele and disease
allele is maximal (δ = δ_max) or half-maximal (δ = ½δ_max) where δ_max = min[(1-m)p, (1-p)m]. TDT
power (two-tailed test) and ASP power (one-tailed test) are for a significance level of 0.05 and are
based on a sample size of 200 families (i.e. 400 parent-ASP trios) and thus n_asp = 400(H/F) while
n_tdt = 800(H/F).

The ASP test can detect linkage over long distances (θ ≫ 0) and in the absence of disequilibrium
(δ = 0); but the TDT has no power when δ = 0 and hence can detect linkage only over short
distances (generally less than 1 cM). Yet when disequilibrium is half-maximal or greater (δ ≥ ½δ_max),
Tables 1, 2 and 3 each show that TDT power almost always exceeds ASP power whether χ²_asp is
applied to a fully informative or bi-allelic marker. When r = 2 (Table 1), linkage is virtually
undetectable by χ²_asp, since (P_s - 0.5) ≤ 0.13 and ASP power is 0.10 or lower; by contrast, the TDT
is able to detect linkage but TDT power exceeds 0.50 only when δ is close to δ_max and allele
frequencies (m, p) are similar in magnitude at the marker and disease locus. For r = 4 (Table 2),
ASP power is increased but still relatively low (< 0.33) for fully informative markers and for most
bi-allelic markers. TDT power is also substantially higher and, for most markers, exceeds 0.95 when
δ = δ_max and exceeds 0.50 when δ = ½δ_max, thus indicating that when r = 4, the TDT could
demonstrate linkage to many disease loci whose linkage might be difficult or impossible to establish by
the ASP test.

For r = 10 (Table 3), TDT power is reasonably high (≥ 0.66) when δ ≥ ½δ_max and ASP power is
also elevated (> 0.50) except at the highest disease allele frequency shown (p = 0.6) where ASP
power is 0.28 for a fully informative marker. Thus as r increases from 2 to 10, the tables show that
P_s and ASP power increase substantially and hence, when r = 10, the relative power advantage of
the TDT is diminished. Nevertheless, as indicated by lower ASP power when p = 0.6 at r = 10
(Table 3), ASP power at elevated disease allele frequencies (p > 0.6) remains low (< 0.50) even
when r → ∞ (data not shown; table available from the author). For example, if the same power
analysis shown in Tables 1-3 were conducted for r = ∞ (i.e. γ = 0) then for a fully informative
marker and disease allele frequency of p = 0.75, P_s and ASP power would be 0.519 and 0.19,
respectively. By contrast, TDT power would be much higher (≥ 0.80) but only when δ ≥ ¾δ_max and
m is close to p = 0.75 (i.e. 0.65 ≤ m ≤ 0.85).

In concluding this section, I emphasize that Tables 1-3 show that when the disease locus and
marker are bi-allelic, TDT power is substantially increased if the disease allele and positively
associated marker allele have similar frequencies. Müller-Myhsok & Abel (1997) independently
made a similar observation, but they emphasized the weakness of TDT power when the m/p ratio
departs from unity and δ is not close to δ_max. However, the tables illustrate that similar frequencies
for the disease allele and associated marker allele can increase TDT power to reasonably high levels
even when the m/p ratio substantially differs from 1 and δ is much lower than δ_max. For example,
in Table 2 (r = 4), note that when δ = ½δ_max and p = 0.15, a similar frequency (m = 0.25) for the
disease-associated marker allele produces TDT power of 0.86 and P_t of 0.581; but when p = 0.15
and m = 0.5 at δ = ½δ_max, TDT power and P_t fall to 0.53 and 0.547, respectively. The difference in
TDT power for these two situations can also be quantified by calculating the mean value of χ²_tdt
based on a sample of 200 ASP families and the values of P_t and H/F in Table 2 [i.e. χ²_tdt =
800(H/F)(2P_t - 1)²]. When p = 0.15 and m = 0.5, χ²_tdt = 3.53 yielding a significance level of p =
0.06; but when p = 0.15 and m = 0.25, χ²_tdt = 9.02 for a significance level of p < 0.003. The large
difference in significance level (0.06 versus 0.003) and power (0.53 versus 0.86) illustrated by this
example indicates that careful attention to allele frequencies at bi-allelic markers may play an
important role in future efforts to map susceptibility loci.
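The two cases in this example can be recomputed from the model parameters. The sketch below is illustrative and rests on assumptions: the function name is hypothetical, H and F are taken as sums of squared per-child risks weighted by parental genotype frequencies (my reading of Appendix I and Table A2), θ = 0, and the additive model sets β = (α + γ)/2 with γ = 1 on an arbitrary scale. Small differences from the quoted χ² values are expected because the text works from rounded table entries.

```python
def tdt_summary(m, p, delta_fraction, r, families=200):
    """Return (P_t, H/F, mean chi-square) for an additive model at theta = 0."""
    g = 1.0
    a = r * g
    b = (a + g) / 2
    delta = delta_fraction * min((1 - m) * p, (1 - p) * m)
    c1, c2 = m * p + delta, m * (1 - p) - delta
    c3, c4 = (1 - m) * p - delta, (1 - m) * (1 - p) + delta
    A, B, C = (a + b) / 2, (a + 2 * b + g) / 4, (b + g) / 2

    def mix(x, y, z):
        # Weight squared child risks by spouse genotype D/D, D/d, d/d.
        return p * p * x * x + 2 * p * (1 - p) * y * y + (1 - p) ** 2 * z * z

    H = (2 * (c1 * c4 + c2 * c3) * mix(A, B, C)
         + 2 * c1 * c3 * mix(a, A, b) + 2 * c2 * c4 * mix(b, C, g))
    F = (p * p * mix(a, A, b) + 2 * p * (1 - p) * mix(A, B, C)
         + (1 - p) ** 2 * mix(b, C, g))
    R_t = (p * p / 4 * (a * a - b * b)
           + 2 * p * (1 - p) / 16 * ((a + b) ** 2 - (b + g) ** 2)
           + (1 - p) ** 2 / 4 * (b * b - g * g))
    P_t = 0.5 + (c1 * c4 - c2 * c3) / H * R_t
    n_tdt = 4 * families * H / F  # 800(H/F) for 200 families
    return P_t, H / F, n_tdt * (2 * P_t - 1) ** 2


for m in (0.25, 0.5):
    P_t, hf, chi2 = tdt_summary(m, p=0.15, delta_fraction=0.5, r=4.0)
    print(m, round(P_t, 3), round(hf, 3), round(chi2, 2))
```

Under these assumptions the computed P_t values round to the 0.581 and 0.547 quoted above, and the mean chi-squares fall close to the quoted 9.02 and 3.53.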

DISCUSSION 

The equations for P_s, P_t and H/F enable comparison of TDT and ASP power for the same family
data, since the three expressions assume random ascertainment of parents of two or more affected
children. However, the TDT can be applied to families with a single affected child, so in Appendix
I, I derive an expression analogous to P_t (denoted P_t*) which gives the probability that allele A was
transmitted to an affected child by a randomly ascertained A/B parent of one or more affected
offspring. The derivation of P_t* is almost identical to that of P_t, and hence the algebraic form of P_t*
is similar to that of P_t and P_s:

$$P_t^* = 0.5 + (1-2\theta)\,\frac{c_1c_4 - c_2c_3}{H^*}\left[\frac{p^2(\alpha-\beta)}{2} + \frac{2p(1-p)(\alpha-\gamma)}{4} + \frac{(1-p)^2(\beta-\gamma)}{2}\right]$$

Equation 3

where

$$H^* = 2(c_1c_4 + c_2c_3)\,\frac{p\alpha - p\gamma + \beta + \gamma}{2} + 2c_1c_3\{p\alpha - p\beta + \beta\} + 2c_2c_4\{p\beta - p\gamma + \gamma\}$$

Previous analyses of TDT power are special cases of the current analysis 

I now show that two previous analytic investigations of TDT power (Terwilliger & Ott, 1992; 
Risch & Merikangas, 1996) are special cases of the current analysis. These two analytic 
investigations and a third by Sham & Curtis (1995) as well as simulation and computer-based 
analyses of TDT power (Schaid & Sommer, 1994; Clerget-Darpoux et al. 1995; Kaplan et al. 1997) 
all assumed the disease locus to be bi-allelic. Sham & Curtis (1995) and Kaplan et al. (1997) considered
a multi-allele marker locus, but the other analyses assumed either a bi-allelic marker or a
direct test of the disease polymorphism itself, and each analysis examined TDT power for one or 
several specific modes of inheritance. 

Terwilliger & Ott (1992) considered a recessive disease with no phenocopies in families
ascertained through a single affected child. By investigating the same recessive model, Ott (1989)
had previously derived an algebraic probability for transmission of each marker allele (denoted H
and h) to affected offspring by heterozygous H/h parents and by both types of homozygous parent
(see Ott's table II). Thus, power results for the TDT (McNemar's test in figure 3 of Terwilliger &
Ott) can be derived by using Ott's table II to compute P_t* for the recessive, zero-phenocopy model.
P_t* is derived by considering only the two probabilities in Ott's table for heterozygous H/h parents,
and by dividing the transmission probability for allele H by the sum of the probabilities for allele
H and allele h to yield:

$$P_t^* = \frac{(m + \delta/p)(1-m) - \theta\delta/p}{(m + \delta/p)(1-m) + m[(1-m) - \delta/p]}$$

where, substituting my notation for Ott's, m is the frequency of marker allele H (or, alternatively,
my allele A), p is the frequency of disease allele D, and δ is the coefficient of disequilibrium. This
expression for P_t* is seen to be a special case of equation (3) by making the appropriate penetrance
substitutions (α > 0, β = γ = 0) into my expressions for P_t* and H*, and by expressing c_1, c_2, c_3 and
c_4 in terms of m, p and δ according to standard expressions given above (see General algebraic
model of linkage).

Risch & Merikangas (1996) compared TDT and ASP power for an intermediate mode of
inheritance in which D/d penetrance (β) is a multiple (k) of d/d penetrance and r = k². For their analysis,
the TDT was assumed to test the disease locus itself or a perfectly associated bi-allelic marker
and, under this assumption, Risch & Merikangas found that P_t* = P_t = k/(1 + k). With appropriate
substitutions (θ = 0, α = k²γ, β = kγ, c_1 = p, c_4 = 1-p, c_2 = c_3 = 0), equations (2) and (3) for P_t and
P_t* also simplify to k/(1 + k), thus agreeing that for this particular model, P_t and P_t* are (a) identical
and (b) independent of disease allele frequency. For many other genetic models, P_t and P_t* do
appear to be similar in value (though not identical). However, graphical analysis of P_t shows that
the value of P_t is independent of disease allele frequency only for the particular mode of inheritance
considered by Risch & Merikangas (graphs showing this are available from the author).
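The reduction to k/(1 + k) can be confirmed numerically. The following sketch is an illustration under stated assumptions: the function name is hypothetical, H is taken as the Table A2 weighted frequencies summed with k = 2, and H* as the same sum with the per-child risks entering to the first power (singleton ascertainment).

```python
def tdt_probabilities(c, p, theta, a, b, g):
    """Return (P_t, P_t*) for haplotype frequencies c = (c1, c2, c3, c4)."""
    c1, c2, c3, c4 = c
    w = (p * p, 2 * p * (1 - p), (1 - p) ** 2)  # spouse D/D, D/d, d/d
    A, B, C = (a + b) / 2, (a + 2 * b + g) / 4, (b + g) / 2

    def mix(x, y, z, power):
        return w[0] * x ** power + w[1] * y ** power + w[2] * z ** power

    def h(power):
        # power = 2 gives H (parents of an ASP); power = 1 gives H* (singletons).
        return (2 * (c1 * c4 + c2 * c3) * mix(A, B, C, power)
                + 2 * c1 * c3 * mix(a, A, b, power)
                + 2 * c2 * c4 * mix(b, C, g, power))

    delta = c1 * c4 - c2 * c3
    R_t = (w[0] / 4 * (a * a - b * b)
           + w[1] / 16 * ((a + b) ** 2 - (b + g) ** 2)
           + w[2] / 4 * (b * b - g * g))
    R_star = w[0] * (a - b) / 2 + w[1] * (a - g) / 4 + w[2] * (b - g) / 2
    P_t = 0.5 + (1 - 2 * theta) * delta / h(2) * R_t
    P_t_star = 0.5 + (1 - 2 * theta) * delta / h(1) * R_star
    return P_t, P_t_star


k, g = 3.0, 0.01
for p in (0.05, 0.3, 0.7):
    P_t, P_t_star = tdt_probabilities((p, 0.0, 0.0, 1 - p), p, 0.0, k * k * g, k * g, g)
    print(p, round(P_t, 6), round(P_t_star, 6), k / (1 + k))
```

Both probabilities stay at k/(1 + k) as p varies, in line with the independence from disease allele frequency noted for this model.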

The expression in Risch & Merikangas (1996) for P_s (their Y) when a marker is fully informative
can also be shown to be a special case of equation (1) for P_s. This is verified by substituting the
appropriate mode of inheritance parameters (α = k²γ, β = kγ) into equation (1) and also by
substituting parameters for a closely linked marker in equilibrium with the disease locus (i.e. θ = 0
and δ = 0 in c_1 = mp + δ, c_2 = m(1-p) - δ, c_3 = (1-m)p - δ and c_4 = (1-m)(1-p) + δ). In
Appendix IV and Results (see Power of χ²_asp and χ²_tdt), I showed that when a bi-allelic marker is in
equilibrium with the disease locus, equation (1) for P_s is identical to the expression for P_s when a marker
is fully informative as was assumed by Risch & Merikangas (1996).

Proportion of A/B parents in ascertained families

Since power analyses often calculate power for a specific number of ascertained families, the
proportion of informative A/B parents in such families must be calculated to determine the subset
of parents to which the TDT or ASP test is applied when a marker is bi-allelic. Among parents
ascertained through one affected child, it can be shown that the expected proportion of A/B
parents is H*/F* where H* is as previously defined (see equation (3)) and F* =
p²α + 2p(1-p)β + (1-p)²γ. Similarly, the expected proportion of A/B parents among those
ascertained through an ASP can be shown to be H/F where H is as defined in Appendix I and

$$F = p^4\alpha^2 + 4p^3(1-p)\left[\frac{\alpha+\beta}{2}\right]^2 + 2p^2(1-p)^2\beta^2 + 4p^2(1-p)^2\left[\frac{\alpha+2\beta+\gamma}{4}\right]^2 + 4p(1-p)^3\left[\frac{\beta+\gamma}{2}\right]^2 + (1-p)^4\gamma^2$$

With appropriate substitutions (α = k²γ, β = kγ), the expressions H*/F* and H/F reduce to the
corresponding expressions given by Risch & Merikangas (1996) for the proportion of heterozygous
parents found in families having at least one and two affected children, respectively. Furthermore,
for the recessive, zero-phenocopy disease considered by Ott (1989), the sum of the two probabilities
for heterozygous parents in Ott's table II gives the proportion of heterozygous parents in families
ascertained through a single affected child. When appropriate substitutions are made (α > 0, β =
γ = 0), the expression H*/F* also reduces to the proportion of heterozygous parents predicted by
Ott's table. It is also important to note that Sham & Curtis (1995) derived a table of probabilities
analogous to Ott's table II, except that their table 3 has entries for a variable number of marker
alleles and their probabilities describe a general model of disease. If table 3 of Sham & Curtis is
assumed to have only two marker alleles, then the two probabilities for heterozygous parents
predict a probability of allele transmission identical to P_t* (equation (3)) as well as a proportion of
heterozygous parents in ascertained families which is identical to H*/F*.

Power of χ²_tdt for a multi-allelic marker

So far the four haplotype frequencies (c_1, c_2, c_3, c_4) have represented a bi-allelic marker linked to
a bi-allelic disease locus; but these frequencies could also correspond to any two marker alleles
(a_i, a_j) of a multi-allelic marker linked to a bi-allelic disease locus [c_1 = f(a_iD), c_2 = f(a_id), c_3 =
f(a_jD), c_4 = f(a_jd)]. The expression for H*/F* (or H/F) would then be the proportion of ascertained
parents expected to be heterozygous for a_i and a_j, and the expression for P_t* (or P_t) would be the
conditional probability that a_i/a_j parents transmit allele a_i to affected offspring. Thus, in principle,
the expressions for P_t* (or P_t) and H*/F* (or H/F) could be used to investigate the power of any
strategy for applying the TDT to a multi-allelic marker.

Here I briefly discuss a strategy recommended by Spielman & Ewens (1996) in which χ²_tdt is
calculated for each allele i of a multi-allelic marker (i = 1 to k) by evaluating parents heterozygous
for allele i and the other alleles grouped together (non-i). The marker is then tested for linkage by
evaluating the statistical significance of the largest of the k χ²_tdt's using significance cutpoints
adjusted for multiple testing and non-independence of the chi-squares (see Ewens & Spielman
[1997] for a table of these cutpoints). The power of this procedure can be estimated for any
multi-allelic marker model as follows: For each i/non-i determine the haplotype frequencies c_1, c_2, c_3, c_4
and calculate the associated values of P_t* and H*/F* (or P_t, H/F and P_s). Then determine the
i/non-i likely to give the highest χ²_tdt by calculating the expected value of each χ²_tdt [E(χ²_tdt)]. (For S
parents of an ASP, it can be shown that E(χ²_tdt) = 2(S - 1)(2P_t - 1)² + 2P_s while for singletons,
E(χ²_tdt) = (S - 1)(2P_t* - 1)² + 1.) For the i/non-i giving the highest E(χ²_tdt), TDT power would then
be determined exactly as for a bi-allelic marker (see Power of χ²_asp and χ²_tdt) except that an adjusted
significance cutpoint would be used as described above.
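The two expected-value formulas can be checked by brute force for a small number of parents. The sketch below is illustrative: the function names are hypothetical, parents are assumed independent, and the per-parent outcome probabilities follow from P_s = P_2A + P_2B and 2P_t - 1 = P_2A - P_2B.

```python
from itertools import product


def expected_chi2_asp_parents(S, P_t, P_s):
    """E(chi-square) when each of S informative parents has an ASP."""
    return 2 * (S - 1) * (2 * P_t - 1) ** 2 + 2 * P_s


def expected_chi2_singletons(S, P_t_star):
    """E(chi-square) when each of S informative parents has one affected child."""
    return (S - 1) * (2 * P_t_star - 1) ** 2 + 1


def brute_force_asp(S, P_t, P_s):
    """Exact E[(n_a - n_b)^2 / (n_a + n_b)] by enumerating S independent parents."""
    p_aa = (P_s + 2 * P_t - 1) / 2  # allele A transmitted to both sibs
    p_bb = (P_s - 2 * P_t + 1) / 2  # allele B transmitted to both sibs
    outcomes = ((2, p_aa), (-2, p_bb), (0, 1 - P_s))  # (n_a - n_b, probability)
    total = 0.0
    for combo in product(outcomes, repeat=S):
        diff = sum(d for d, _ in combo)
        prob = 1.0
        for _, q in combo:
            prob *= q
        total += prob * diff ** 2 / (2 * S)
    return total


print(expected_chi2_asp_parents(4, 0.58, 0.54), brute_force_asp(4, 0.58, 0.54))
```

Under the null hypothesis (P_t = P_s = 0.5) both formulas return 1, the mean of a chi-square with one degree of freedom.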

To briefly examine power for a particular multi-allelic example, consider the bi-allelic marker
and disease locus in the bottom line of Table 2 (r = 4). The frequencies (p and m) of disease allele
D and positively associated marker allele A are 0.15 and 0.25, respectively, and TDT power is 0.99
when δ = δ_max and 0.86 when δ = ½δ_max. Suppose p and m remain constant as does the degree of
positive association between alleles A and D (δ = δ_max or ½δ_max) but suppose the marker consists of
k - 1 additional (non-A) alleles having negative or no association with disease allele D. Then
A/non-A would give the highest E(χ²_tdt) of any i/non-i and thus TDT power would be determined by the
P_t and H/F shown in the bottom line of the table. According to Ewens & Spielman (1997), adjusted
cutpoints (0.05 significance) for k = 2, k = 4 and k = 8 are χ²_tdt = 3.84, 6.10 and 7.41, respectively;
thus TDT power when k = 2, 4 or 8 would be 0.86, 0.71 or 0.61 at δ = ½δ_max and would be 0.99 for
k ≤ 8 when δ = δ_max. This example suggests that TDT power for a multi-allelic marker remains
relatively strong if (a) one marker allele is strongly associated with either allele of a bi-allelic disease
locus and (b) the two associated alleles have similar population frequencies.

Concluding remarks 

Strength of evidence for linkage provided by χ²_asp and χ²_tdt critically depends upon the magnitude
of departure from the null hypothesis value of 0.5, the size of departure being quantified by
(P_s - 0.5) and |P_t - 0.5| for the ASP and TDT paradigms, respectively. In this paper, I have shown
that (P_s - 0.5) and |P_t - 0.5| are each a product of three corresponding factors [(P_s - 0.5) =
L_s M_s R_s, |P_t - 0.5| = L_t |M_t| R_t]. L_s and L_t depend only on the recombination fraction (θ), R_s and R_t
depend only on disease penetrance (α, β, γ) and the frequency (p) of the disease allele and,
furthermore, marker allele frequency (m) and disequilibrium (δ) influence only M_s and |M_t|. Hence,
the corresponding factors in (P_s - 0.5) and |P_t - 0.5| facilitate comparisons between the ASP and
TDT paradigms, and also enable some 'partitioning' of the contribution to evidence for linkage
provided by standard genetic variables such as θ, δ, m, etc.

Together with the expression for parental heterozygosity at the marker (H/F), the expressions
for P_s and P_t provide a general framework for calculating and comparing the power of χ²_asp and χ²_tdt.
This framework generalizes the ASP-TDT comparison of Risch & Merikangas (1996) by
encompassing many modes of inheritance rather than just one, and also by enabling TDT power to be
calculated for a marker that is distinct from the disease locus. Analysis of the equations shows
that TDT power is greatly increased if disequilibrium is strong and if the disease allele and
positively associated marker allele have similar population frequencies. The equations also show that
the superior power of the TDT compared to the ASP test is greatest when susceptibility loci confer
modest disease risk, as indicated by low values of the penetrance ratio r. When a marker is strongly
associated with a disease locus that contributes modest disease risk, |P_t - 0.5| > (P_s - 0.5) ≈ 0. Thus,
the TDT is likely to play an important role in detecting and replicating linkages to loci responsible
for complex genetic disease.

I am deeply grateful to Richard Spielman for encouragement and valuable suggestions as this work developed. I 
am also indebted to Warren Ewens for valuable comments and for criticism that improved the manuscript. This 
research was supported by NIH grants DK46618 and DK47481 and by grant 193189 from the Juvenile Diabetes 
Foundation.

APPENDIX I

Derivation of expressions for P_s, P_t, H

The derivations assume the general model of a bi-allelic marker and linked bi-allelic disease locus
that is the only locus that underlies disease susceptibility (see General algebraic model of linkage
in the main text). I begin the derivation of P_s and P_t (equations (1) and (2)) by first deriving






R. E. McGinnis 



Table A1. Conditional on mating type, the probability that allele D or d is transmitted from a specific
D/d parent to any specific affected child

                  Probability of transmission to affected
Mating type       Allele D               Allele d
D/d × D/D         α/(α+β)                β/(α+β)
D/d × D/d         (α+β)/(α+2β+γ)         (β+γ)/(α+2β+γ)
D/d × d/d         β/(β+γ)                γ/(β+γ)

Table A2. Families with a heterozygous A/B father and N children, at least k of whom are affected

Mating type                                            Probability of transmission to affected
Fa × Mo         'Weighted' frequency                   Allele A                      Allele B
AD/Bd × D/D     2(c_1c_4)p²N[(α+β)/2]^k                [α - θ(α-β)]/(α+β)            [β - θ(β-α)]/(α+β)
AD/Bd × D/d     4(c_1c_4)p(1-p)N[(α+2β+γ)/4]^k         [(α+β) - θ(α-γ)]/(α+2β+γ)     [(β+γ) - θ(γ-α)]/(α+2β+γ)
AD/Bd × d/d     2(c_1c_4)(1-p)²N[(β+γ)/2]^k            [β - θ(β-γ)]/(β+γ)            [γ - θ(γ-β)]/(β+γ)
Ad/BD × D/D     2(c_2c_3)p²N[(α+β)/2]^k                [β - θ(β-α)]/(α+β)            [α - θ(α-β)]/(α+β)
Ad/BD × D/d     4(c_2c_3)p(1-p)N[(α+2β+γ)/4]^k         [(β+γ) - θ(γ-α)]/(α+2β+γ)     [(α+β) - θ(α-γ)]/(α+2β+γ)
Ad/BD × d/d     2(c_2c_3)(1-p)²N[(β+γ)/2]^k            [γ - θ(γ-β)]/(β+γ)            [β - θ(β-γ)]/(β+γ)
AD/BD × D/D     2(c_1c_3)p²Nα^k                        0.5                           0.5
AD/BD × D/d     4(c_1c_3)p(1-p)N[(α+β)/2]^k            0.5                           0.5
AD/BD × d/d     2(c_1c_3)(1-p)²Nβ^k                    0.5                           0.5
Ad/Bd × D/D     2(c_2c_4)p²Nβ^k                        0.5                           0.5
Ad/Bd × D/d     4(c_2c_4)p(1-p)N[(β+γ)/2]^k            0.5                           0.5
Ad/Bd × d/d     2(c_2c_4)(1-p)²Nγ^k                    0.5                           0.5

expressions for two related probabilities denoted P_2A and P_2B. In the main text, I assume
ascertainment of parents through a randomly selected ASP and note that equations (1) and (2) apply
to an ascertained parent who is also informative (A/B) at a bi-allelic marker. P_2A is the probability
that the parent transmitted allele A to both offspring in the ASP, and P_2B is the probability that
B was transmitted to both. Since P_s is the probability that an ascertained A/B parent transmitted
the same marker allele to an ASP, it follows that P_s = P_2A + P_2B. Similarly, P_t is the probability
that the A/B parent transmitted allele A to a specific affected child, and hence

$$P_t = P_{2A} + \tfrac{1}{2}(1 - P_{2A} - P_{2B}) = \tfrac{1}{2}\left[1 + (P_{2A} - P_{2B})\right].$$

Therefore, the expressions for P_s and P_t can be found by deriving expressions for (P_2A + P_2B) and
for (P_2A - P_2B). As a preliminary step in finding these expressions, I present Table A1 which shows
the three mating types that contain at least one parent who is heterozygous (D/d) at the disease
locus. Conditional upon mating type, the table gives the probability that a particular D/d parent
in the family transmitted allele D or allele d to any specific affected child. Assuming that the
bi-allelic disease locus is the only locus responsible for disease susceptibility, it can be shown that the
conditional probabilities in Table A1 are valid, regardless of the number of children in the family
or the number who are affected.

Using the probabilities in Table A1, I now derive expressions for P_2A and P_2B. To simplify the
derivation, I calculate P_2A (P_2B) as the conditional probability that allele A (allele B) was
transmitted to two affected sibs by an A/B father randomly selected from a subpopulation of families
having exactly N children, two or more of whom are affected. I then show that the results are
identical to those for A/B parents randomly selected from families of any size that have two or
more affected offspring.

Based in part on the conditional probabilities in Table A1, the entries in Table A2 describe
families with an A/B father and N children, at least k of whom are affected. For a subpopulation
of families having exactly N children, the subpopulation frequency for each mating type in column
1 of Table A2 can be subdivided into a summation of component frequencies with each component
corresponding to the exact number (r) of affected children in the family. Specifically, the
component frequency for each mating type with exactly r affected children (k ≤ r ≤ N) is found by
multiplying three factors: (a) the general population frequency of the father's genotype as defined
by the two marker-disease locus haplotypes in the A/B father times (b) the general population
frequency of the disease locus genotype in the mother times (c) the conditional probability that
the mating type produces exactly r affected offspring among the N children. For the mating type
on the first line of Table A2, the subpopulation or component frequency for families with r affected
offspring would be:

$$2(c_1c_4)\,p^2 \binom{N}{r}\left[\frac{\alpha+\beta}{2}\right]^{r}\left[1-\frac{\alpha+\beta}{2}\right]^{N-r}.$$

The probability that a particular mating type with r affected children will be randomly selected
(ascertained) from the subpopulation is directly proportional to the component frequency
'weighted' by (multiplied by) a factor (f_w) that depends on the random ascertainment scheme.
When only one affected child is required for family ascertainment (k = 1), the probability of
ascertainment is directly proportional to the number (r) of affected children in the family since any of
these is a potential proband. Thus f_w = r when k = 1. When an affected sib pair (k = 2) is required
for ascertainment, I adopt the ascertainment scheme for ASPs that is implicit in the calculation of
λ_s (Risch, 1990). In this scheme, a proband (affected individual) is randomly selected from the
population and then one of the proband's sibs (affected or unaffected) is randomly selected. Only
pairs in which the second sib is affected contribute to the magnitude of the recurrence risk (K_s)
and to the magnitude of λ_s = K_s/K (see Risch, 1990) and thus only these pairs (and the families
that produced them) are ascertained. By contrast, pairs in which the second sib is unaffected do
not contribute to the magnitude of K_s or λ_s and are, in effect, discarded (family not ascertained).
Based on this ascertainment scheme for ASPs, the component frequency for a mating type with r
affected offspring would be weighted by f_w = r(r - 1)/(N - 1) since (r - 1)/(N - 1) is the probability
of randomly selecting a second affected sib from the N - 1 children who remain after proband
selection.

For each mating type in column 1 of Table A2, the probability of randomly selecting or 
ascertaining a family with at least k affected offspring is directly proportional to the sum of the 
weighted component frequencies for which r is greater than or equal to k (i.e. $k \le r \le N$). For each 
mating type in Table A2, this sum simplifies to the overall 'weighted' subpopulation frequency 
shown in column 2. For example, when k = 2 (and hence $f_w = r(r-1)/(N-1)$), the sum for the 
mating type in line 1 simplifies as follows: 



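$$\sum_{r=2}^{N} \frac{r(r-1)}{N-1}\; 2(c_1 c_4)\,p^2 \binom{N}{r}\left(\frac{\alpha+\beta}{2}\right)^{r}\left(1-\frac{\alpha+\beta}{2}\right)^{N-r} = 2(c_1 c_4)\,p^2\, N \left(\frac{\alpha+\beta}{2}\right)^{2}.$$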
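The simplification of the weighted sum can be checked numerically. The following is a minimal Python sketch, not part of the original derivation: the function name `weighted_sum` and the penetrance values are illustrative assumptions. It sums the weighted binomial terms directly and compares them with the closed forms $Nq$ (for k = 1) and $Nq^2$ (for k = 2), where q is the per-child probability of being affected for the mating type.

```python
from math import comb


def weighted_sum(n_children, q, k):
    """Sum the weighted binomial component frequencies for r = k..N.

    q is the per-child probability of being affected for one mating type.
    The weight f_w is r when k = 1 and r(r - 1)/(N - 1) when k = 2.
    """
    total = 0.0
    for r in range(k, n_children + 1):
        f_w = r if k == 1 else r * (r - 1) / (n_children - 1)
        total += f_w * comb(n_children, r) * q**r * (1 - q) ** (n_children - r)
    return total


if __name__ == "__main__":
    q = (0.8 + 0.3) / 2  # (alpha + beta) / 2 with illustrative penetrances
    for n in (2, 3, 6, 10):
        # Columns: N, direct sum (k=1), N*q, direct sum (k=2), N*q^2
        print(n, weighted_sum(n, q, 1), n * q, weighted_sum(n, q, 2), n * q**2)
```

The common factor $2(c_1 c_4)p^2$ is omitted because it multiplies both sides of the identity.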



Based on the 'weighted' subpopulation frequencies in column 2, what then is the probability of 
randomly selecting a particular 'A/B father-mating type' from among families with an A/B father 
and N children, at least two of whom are affected? Setting k = 2, the probability ($P_{rs}$) of randomly 
selecting a particular mating type would be the weighted frequency of the mating type divided by 
the sum of all the weighted frequencies in column 2. Thus, the probability of randomly selecting 
the mating type on line 1 (AD/Bd × D/D) would be: 

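$$P_{rs} = \frac{2(c_1 c_4)\,p^2}{H}\left(\frac{\alpha+\beta}{2}\right)^{2}$$

where

$$\begin{aligned}
H ={}& 2(c_1 c_4 + c_2 c_3)\left\{p^2\left(\frac{\alpha+\beta}{2}\right)^{2} + \tfrac{1}{2}p(1-p)\left(\frac{\alpha+2\beta+\gamma}{2}\right)^{2} + (1-p)^2\left(\frac{\beta+\gamma}{2}\right)^{2}\right\} \\
&+ 2c_1 c_3\left\{p^2\alpha^2 + \tfrac{1}{2}p(1-p)(\alpha+\beta)^2 + (1-p)^2\beta^2\right\} \\
&+ 2c_2 c_4\left\{p^2\beta^2 + \tfrac{1}{2}p(1-p)(\beta+\gamma)^2 + (1-p)^2\gamma^2\right\}.
\end{aligned}$$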

Note that the quantity N cancels in the numerator and denominator of $P_{rs}$, thus demonstrating 
that the probability is independent of family size (N) and hence applies to mixed populations of 
families of any size that have 2 or more affected offspring. Furthermore, if mating types with A/B 
mothers were included in Table A2, the only effect would be to double each frequency in column 2; 
but $P_{rs}$ would be unchanged since the factor of 2 in the doubled numerator and denominator of $P_{rs}$ would 
cancel. Thus, by setting k = 2, the $P_{rs}$ calculated for each mating type in Table A2 applies to random 
selection of A/B parents of two or more affected offspring from families of any size. 
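The independence of $P_{rs}$ from family size can be illustrated with a short Python sketch. It is a hedged illustration under stated assumptions: the function names, the haplotype frequencies and the penetrances are hypothetical, and the twelve mating types are enumerated as four A/B father haplotype pairs crossed with three maternal disease genotypes, following the description of Table A2 in the text. The weighted frequencies are summed term by term for a given N and then normalized.

```python
from math import comb


def affected_prob(father, mother, alpha, beta, gamma):
    """Per-child probability of being affected, given parental disease genotypes."""
    pf = father.count("D") / 2  # chance that the father transmits D
    pm = mother.count("D") / 2  # chance that the mother transmits D
    return (alpha * pf * pm
            + beta * (pf * (1 - pm) + (1 - pf) * pm)
            + gamma * (1 - pf) * (1 - pm))


def selection_probs(c, p, penetrances, n_children):
    """P_rs for each A/B-father mating type, with k = 2 and exactly N children."""
    c1, c2, c3, c4 = c
    fathers = {"AD/Bd": (2 * c1 * c4, "Dd"), "Ad/BD": (2 * c2 * c3, "Dd"),
               "AD/BD": (2 * c1 * c3, "DD"), "Ad/Bd": (2 * c2 * c4, "dd")}
    mothers = {"DD": p * p, "Dd": 2 * p * (1 - p), "dd": (1 - p) ** 2}
    n = n_children
    weighted = {}
    for f_name, (f_freq, f_geno) in fathers.items():
        for m_geno, m_freq in mothers.items():
            q = affected_prob(f_geno, m_geno, *penetrances)
            # Weighted sum over r = 2..N with f_w = r(r - 1)/(N - 1)
            s = sum(r * (r - 1) / (n - 1) * comb(n, r) * q**r * (1 - q) ** (n - r)
                    for r in range(2, n + 1))
            weighted[(f_name, m_geno)] = f_freq * m_freq * s
    total = sum(weighted.values())
    return {key: w / total for key, w in weighted.items()}


if __name__ == "__main__":
    c = (0.13, 0.27, 0.07, 0.53)   # illustrative c1..c4 (m = 0.4, p = 0.2, delta = 0.05)
    pen = (0.8, 0.3, 0.05)         # illustrative alpha, beta, gamma
    for n in (2, 4, 8):
        print(n, selection_probs(c, 0.2, pen, n)[("AD/Bd", "DD")])
```

Under these assumptions the printed probability for the AD/Bd × D/D mating type should agree across family sizes up to rounding error.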

What, then, is the probability that a randomly selected A/B parent of a particular mating type 
transmitted allele A (allele B) to an affected child? For each mating type in Table A2, the two 
rightmost columns show the conditional probability that the A/B parent transmitted allele A or B 
to an individual affected offspring. These probabilities follow directly from the conditional 
probabilities in Table A1. For example, in the AD/Bd × D/D mating in line 1 of Table A2, the A/B parent 
has allele A in coupling with allele D, and B in coupling with d, and thus Table A1 implies that A 
is transmitted to affected offspring with a probability of 

$$\frac{\alpha-\theta(\alpha-\beta)}{\alpha+\beta}.$$

Therefore, from the results of the previous paragraph, the joint probability that (a) the 
AD/Bd × D/D mating type (line 1) is randomly selected and (b) the A/B parent transmitted allele 
A to both affected sibs of an ASP would be: 

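$$\frac{2(c_1 c_4)\,p^2}{H}\left(\frac{\alpha+\beta}{2}\right)^{2}\left[\frac{\alpha-\theta(\alpha-\beta)}{\alpha+\beta}\right]^{2} = \frac{2(c_1 c_4)\,p^2}{H}\left[\frac{\alpha-\theta(\alpha-\beta)}{2}\right]^{2}.$$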

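$P_{2A}$ would then be the sum of this joint probability and the corresponding joint probabilities for 
the other 11 mating types in Table A2: 

$$\begin{aligned}
P_{2A} ={}& \frac{2(c_1 c_4)}{H}\left\{p^2\left[\frac{\alpha-\theta(\alpha-\beta)}{2}\right]^{2} + 2p(1-p)\left[\frac{\alpha+\beta-\theta(\alpha-\gamma)}{4}\right]^{2} + (1-p)^2\left[\frac{\beta-\theta(\beta-\gamma)}{2}\right]^{2}\right\} \\
&+ \frac{2(c_2 c_3)}{H}\left\{p^2\left[\frac{\beta+\theta(\alpha-\beta)}{2}\right]^{2} + 2p(1-p)\left[\frac{\beta+\gamma+\theta(\alpha-\gamma)}{4}\right]^{2} + (1-p)^2\left[\frac{\gamma+\theta(\beta-\gamma)}{2}\right]^{2}\right\} \\
&+ \frac{2(c_1 c_3)}{H}\left\{p^2\left[\frac{\alpha}{2}\right]^{2} + 2p(1-p)\left[\frac{\alpha+\beta}{4}\right]^{2} + (1-p)^2\left[\frac{\beta}{2}\right]^{2}\right\} \\
&+ \frac{2(c_2 c_4)}{H}\left\{p^2\left[\frac{\beta}{2}\right]^{2} + 2p(1-p)\left[\frac{\beta+\gamma}{4}\right]^{2} + (1-p)^2\left[\frac{\gamma}{2}\right]^{2}\right\}.
\end{aligned}$$

Similarly, $P_{2B}$ equals: 

$$\begin{aligned}
P_{2B} ={}& \frac{2(c_1 c_4)}{H}\left\{p^2\left[\frac{\beta+\theta(\alpha-\beta)}{2}\right]^{2} + 2p(1-p)\left[\frac{\beta+\gamma+\theta(\alpha-\gamma)}{4}\right]^{2} + (1-p)^2\left[\frac{\gamma+\theta(\beta-\gamma)}{2}\right]^{2}\right\} \\
&+ \frac{2(c_2 c_3)}{H}\left\{p^2\left[\frac{\alpha-\theta(\alpha-\beta)}{2}\right]^{2} + 2p(1-p)\left[\frac{\alpha+\beta-\theta(\alpha-\gamma)}{4}\right]^{2} + (1-p)^2\left[\frac{\beta-\theta(\beta-\gamma)}{2}\right]^{2}\right\} \\
&+ \frac{2(c_1 c_3)}{H}\left\{p^2\left[\frac{\alpha}{2}\right]^{2} + 2p(1-p)\left[\frac{\alpha+\beta}{4}\right]^{2} + (1-p)^2\left[\frac{\beta}{2}\right]^{2}\right\} \\
&+ \frac{2(c_2 c_4)}{H}\left\{p^2\left[\frac{\beta}{2}\right]^{2} + 2p(1-p)\left[\frac{\beta+\gamma}{4}\right]^{2} + (1-p)^2\left[\frac{\gamma}{2}\right]^{2}\right\}.
\end{aligned}$$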



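Therefore, $(P_{2A} - P_{2B})$ simplifies to the expression designated equation (4) [displayed 
equation not recoverable from this copy]. 

Hence, equation (2) in the main text follows from the equation $P_t = \tfrac{1}{2} + \tfrac{1}{2}(P_{2A} - P_{2B})$ given at the 
beginning of Appendix I. Similarly, by adding $P_{2A}$ and $P_{2B}$, $P_s = (P_{2A} + P_{2B})$ simplifies, after some 
algebra, to equation (1). 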
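The quantities $P_{2A}$, $P_{2B}$, $P_s$ and $P_t$ can also be evaluated numerically by summing over the twelve mating types. The sketch below is an illustration under stated assumptions rather than the author's own computation: the function names and parameter values are hypothetical, the haplotype frequencies use the parameterization $c_1 = mp + \delta$, $c_2 = m(1-p) - \delta$, $c_3 = (1-m)p - \delta$, $c_4 = (1-m)(1-p) + \delta$ from Appendix III, and the per-child joint probabilities are derived from the recombination fraction $\theta$ and the penetrances in the same way as the line 1 example above.

```python
def joint_terms(theta, alpha, beta, gamma):
    """(P(affected and received A), P(affected and received B)) from an A/B
    father, keyed by father haplotype pair and then by mother genotype."""
    t = theta
    coupling = {  # father AD/Bd: A travels with D unless recombination occurs
        "DD": ((alpha - t * (alpha - beta)) / 2, (beta + t * (alpha - beta)) / 2),
        "Dd": ((alpha + beta - t * (alpha - gamma)) / 4,
               (beta + gamma + t * (alpha - gamma)) / 4),
        "dd": ((beta - t * (beta - gamma)) / 2, (gamma + t * (beta - gamma)) / 2),
    }
    repulsion = {g: (b, a) for g, (a, b) in coupling.items()}  # father Ad/BD
    hom_dd_upper = {"DD": (alpha / 2,) * 2, "Dd": ((alpha + beta) / 4,) * 2,
                    "dd": (beta / 2,) * 2}                     # father AD/BD
    hom_dd_lower = {"DD": (beta / 2,) * 2, "Dd": ((beta + gamma) / 4,) * 2,
                    "dd": (gamma / 2,) * 2}                    # father Ad/Bd
    return coupling, repulsion, hom_dd_upper, hom_dd_lower


def asp_probabilities(m, p, delta, theta, alpha, beta, gamma):
    """Return P2A, P2B, P_s and P_t for A/B parents ascertained through an ASP."""
    c1 = m * p + delta
    c2 = m * (1 - p) - delta
    c3 = (1 - m) * p - delta
    c4 = (1 - m) * (1 - p) + delta
    mothers = {"DD": p * p, "Dd": 2 * p * (1 - p), "dd": (1 - p) ** 2}
    coupling, repulsion, f_dd_upper, f_dd_lower = joint_terms(theta, alpha, beta, gamma)
    fathers = [(2 * c1 * c4, coupling), (2 * c2 * c3, repulsion),
               (2 * c1 * c3, f_dd_upper), (2 * c2 * c4, f_dd_lower)]
    h_total = num_a = num_b = 0.0
    for f_freq, terms in fathers:
        for m_geno, m_freq in mothers.items():
            a, b = terms[m_geno]
            w = f_freq * m_freq
            h_total += w * (a + b) ** 2   # a + b is the per-child affected probability
            num_a += w * a * a            # A transmitted to both affected sibs
            num_b += w * b * b            # B transmitted to both affected sibs
    p2a, p2b = num_a / h_total, num_b / h_total
    return {"P2A": p2a, "P2B": p2b, "P_s": p2a + p2b, "P_t": 0.5 + 0.5 * (p2a - p2b)}


if __name__ == "__main__":
    print(asp_probabilities(m=0.4, p=0.2, delta=0.05, theta=0.0,
                            alpha=0.8, beta=0.3, gamma=0.05))
```

Two limiting cases serve as sanity checks on the sketch: with no disequilibrium ($\delta = 0$) it returns $P_t = 0.5$, and with no linkage ($\theta = 0.5$) it returns $P_s = P_t = 0.5$.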

Derivation of expressions for $P_t^*$ and $H^*$ 

To simplify this derivation, I use '$P_A$' to denote what appears in the main text as the probability 
'$P_t^*$'. $P_A$ ($P_B$) is the probability that allele A (allele B) was transmitted to an individual affected 
child by a randomly ascertained A/B parent of one or more affected offspring. So by setting k = 1 
in Table A2, the derivations of $P_A$, $P_B$ and $H^*$ are analogous to the derivations of $P_{2A}$, $P_{2B}$ and H 
described above. Thus, summing the frequencies of the mating types in Table A2 for k = 1 and 
factoring out N, I obtain: 

$$H^* = 2(c_1 c_4 + c_2 c_3)\,\frac{p\alpha - p\gamma + \beta + \gamma}{2} + 2c_1 c_3\{p\alpha - p\beta + \beta\} + 2c_2 c_4\{p\beta - p\gamma + \gamma\}.$$
Similarly, $P_A$ ($P_B$) equals an expression identical to the expression for $P_{2A}$ ($P_{2B}$) shown above, except 
that $H^*$ replaces H and each quantity inside square brackets is not squared in the corresponding 
expression for $P_A$ ($P_B$). Based on these expressions for $P_A$ and $P_B$, the expression for $(P_A - P_B)$ 
simplifies in a manner analogous to $(P_{2A} - P_{2B})$ [see equation (4)]. Therefore, the expression for $P_A$ 
can also be simplified by using the relation $(P_A + P_B) = 1$, which implies that $P_A = \tfrac{1}{2} + \tfrac{1}{2}(P_A - P_B)$. This 
simplified expression for $P_A$ is shown in the main text (equation (3) for $P_t^*$). 



APPENDIX II 

Power of $\chi^2_{asp}$ and $\chi^2_{tdt}$ 

Assume that each of S parents is independently and randomly ascertained through an ASP. 
Suppose we wish to determine the power of $\chi^2_{asp}$ and $\chi^2_{tdt}$ when applied to those parents who are 
informative (A/B) at a bi-allelic marker. If there are h heterozygous A/B parents, and if $\chi^2_{asp}$ and 
$\chi^2_{tdt}$ consider one ASP per parent, then the random variables i and j denote the following three 
subdivisions of the h parents: 

i = number of A/B parents who transmit allele A to both affected offspring, or the number of ASPs 
that share allele A 

j = number of A/B parents who transmit allele B to both affected offspring, or the number of ASPs 
that share allele B 

(h − i − j) = number of A/B parents who transmit allele A to one affected offspring and allele B to the 
other, or the number of ASPs that share neither allele A nor B 

As shown in Spielman et al. (1993), $\chi^2_{tdt}$ and $\chi^2_{asp}$ can be expressed in terms of h, i and j as follows: 
$\chi^2_{tdt} = [2(i-j)^2]/h$ and $\chi^2_{asp} = [(2(i+j)-h)^2]/h$. Hence, the probability of each possible (i, j) 
combination gives the probability of each possible value of $\chi^2_{tdt}$ and $\chi^2_{asp}$. 

What, then, is the probability of each (i, j) combination, when a total of h A/B parents are 
randomly selected (ascertained) from A/B parents of one or more ASPs? The probability that the 
first randomly selected parent transmitted A or B to both affected offspring is given by $P_{2A}$ and 
$P_{2B}$, respectively (see Appendix I). Furthermore, sampling with replacement would apply if h is 
small relative to the population being sampled. Therefore, $P_{2A}$, $P_{2B}$ and $(1 - P_{2A} - P_{2B})$ would be the 
probabilities of the three possible outcomes generated by each randomly selected parent, and since 
the random selections are independent, the joint probability distribution for (i, j) is: 

$$P(i,j) = \frac{h!}{i!\,j!\,(h-i-j)!}\,(P_{2A})^{i}\,(P_{2B})^{j}\,(1-P_{2A}-P_{2B})^{h-i-j}. \qquad \text{Equation 5}$$
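The exact power calculation from equation (5) can be sketched in Python as follows. This is a minimal illustration, not the author's program: the function names and the example values of $P_{2A}$ and $P_{2B}$ are assumptions, and L = 3.841 is used as the familiar 0.05 cutpoint of a chi-square with one degree of freedom. The rejection regions are $|i - j| > \sqrt{hL/2}$ for $\chi^2_{tdt}$ and $(i + j) > h/2 + \sqrt{hL}/2$ for $\chi^2_{asp}$, which follow from the expressions for the two statistics in terms of h, i and j.

```python
from math import factorial, sqrt


def trinomial_pmf(i, j, h, p2a, p2b):
    """P(i, j) from equation (5)."""
    coef = factorial(h) // (factorial(i) * factorial(j) * factorial(h - i - j))
    return coef * p2a**i * p2b**j * (1 - p2a - p2b) ** (h - i - j)


def exact_power(h, p2a, p2b, cutpoint):
    """Exact power of the TDT and ASP chi-square tests for h A/B parents."""
    tdt_limit = sqrt(h * cutpoint / 2)           # reject when |i - j| exceeds this
    asp_limit = h / 2 + sqrt(h * cutpoint) / 2   # reject when i + j exceeds this
    tdt = asp = 0.0
    for i in range(h + 1):
        for j in range(h - i + 1):
            prob = trinomial_pmf(i, j, h, p2a, p2b)
            if abs(i - j) > tdt_limit:
                tdt += prob
            if i + j > asp_limit:
                asp += prob
    return tdt, asp


if __name__ == "__main__":
    # Illustrative values only; real values of P2A and P2B come from Appendix I.
    print(exact_power(h=100, p2a=0.35, p2b=0.20, cutpoint=3.841))
```

The double loop visits every (i, j) pair with $i + j \le h$, so the cost grows quadratically in h, which is modest for data sets of around 100 parents.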

Consequently, $\chi^2_{asp}$ power, or the probability that $\chi^2_{asp} > L$ (a significance cutpoint), is given by 
the sum of the P(i, j) terms for which $(i+j) > h/2 + \sqrt{hL}/2$. Similarly, the power of $\chi^2_{tdt}$ is the 
sum of P(i, j) terms for which $|i-j| > \sqrt{hL/2}$. However, these power determinations are simplified 
numerically and conceptually by reducing the trinomial distribution to one of two binomial 
distributions. With respect to $\chi^2_{asp}$, h randomly selected A/B parents can be regarded as a series of h 
binomial trials, each having $(P_{2A} + P_{2B})$ probability of generating an ASP that shares parental 
allele A or B. Consequently, $\chi^2_{asp}$ power is determined by the binomial distribution 

$$\frac{h!}{(i+j)!\,(h-i-j)!}\,(P_{2A}+P_{2B})^{i+j}\,(1-P_{2A}-P_{2B})^{h-i-j}$$

or, substituting notation from the main text ($n_{asp}$ for h, $n_s$ for i + j, and $P_s$ for $P_{2A} + P_{2B}$), 

$$\frac{n_{asp}!}{n_s!\,(n_{asp}-n_s)!}\,P_s^{\,n_s}\,(1-P_s)^{n_{asp}-n_s}$$

with power specifically equal to the portion of the distribution for which 

$$n_s > n_{asp}/2 + \sqrt{n_{asp}L}/2.$$



Although the binomial distribution based on $P_s$ and $n_{asp}$ trials enables exact calculation of $\chi^2_{asp}$ 
power, the power of $\chi^2_{tdt}$ can be precisely estimated, but not calculated exactly, from a second 
binomial distribution generated by $P_t$. Among h randomly selected A/B parents of an ASP, each 
parent would transmit allele A to an individual affected child with a probability of 
$P_{2A} + \tfrac{1}{2}(1 - P_{2A} - P_{2B}) = \tfrac{1}{2} + \tfrac{1}{2}(P_{2A} - P_{2B}) = P_t$. This suggests that $\chi^2_{tdt}$ power can be determined by the 
binomial distribution 

$$\frac{n_{tdt}!}{n_a!\,n_b!}\,P_t^{\,n_a}\,(1-P_t)^{n_b}$$

where $n_{tdt} = 2h$, $n_a = h + (i - j)$, and $n_b = h - (i - j)$. 

The estimated power of $\chi^2_{tdt}$ would then equal the portion of the distribution for which 
$|n_a - n_b| > \sqrt{n_{tdt}L}$. This is equivalent to summing the two 'tails' of the distribution, i.e. the 
portion of the distribution for which $n_a > n_{tdt}/2 + \sqrt{n_{tdt}L}/2$ plus the portion for which 
$n_b > n_{tdt}/2 + \sqrt{n_{tdt}L}/2$. 

Thus, computer summation of the binomial terms in the two tails gives an estimate of $\chi^2_{tdt}$ power 
that can be compared with exact power obtained by summing the terms of the P(i, j) trinomial 
distribution (equation (5)) for which $|i-j| > \sqrt{hL/2}$. I performed this comparison by calculating 
$P_t$, $P_{2A}$ and $P_{2B}$ for 1045 genetic parameter combinations that derived from five different groups 
(209 combinations per group). For three groups, the marker and disease locus were assumed to be 
identical, but the groups differed by assuming an α:γ penetrance ratio (r) of 2, 4 or ∞. Mode of 
inheritance (x) for these groups was defined by the numerical 'distance' of D/d penetrance (β) 
between D/D penetrance (α = rγ) and d/d penetrance (γ), i.e. β = x(r − 1)γ + γ. By allowing x to 
vary between x = 0 and x = 1 in increments of 0.1 and allowing disease allele frequency (p) to vary 
between p = 0.05 and p = 0.95 in increments of 0.05, 209 ordered (x, p) pairs or parameter 
combinations were formed for each of the three groups (11 × 19 = 209). The two remaining groups were 
for r = 4 or r = ∞, and both groups assumed a bi-allelic marker with equally frequent alleles 
(m = 0.5) to be tightly linked (θ = 0) to a bi-allelic disease locus with additive mode of inheritance 
(β = (α + γ)/2). For these two groups, the variable x represented degree of disequilibrium between 
marker and disease locus (x = 0 was equilibrium, x = 1 maximum disequilibrium). 209 parameter 
pairs were formed for different combinations of disequilibrium (x) and disease allele frequency (p). 

By assuming a data set of 100 A/B parents of an ASP, I calculated $\chi^2_{tdt}$ power at each parameter 
combination for significance levels of 0.05, 0.01 and 0.001 by obtaining (a) the binomial power 
estimate based on $P_t$ and (b) the exact trinomial power value based on $P_{2A}$ and $P_{2B}$. The exact 
trinomial value and binomial estimate differed by less than 0.01 for 95% of the 3135 possibilities 
tested (1045 parameter combinations at three significance levels) and by less than 0.02 for 98%. 
The largest overestimate of TDT power differed from the 
exact value by 0.037 and the largest underestimate differed by 0.052. These results show that the 
binomial estimate based on $P_t$ is very precise. 
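The comparison between the binomial estimate and the exact trinomial value can be reproduced in outline with the following Python sketch. It is offered as an illustration under stated assumptions, not as the program used for the 3135 comparisons: the function names and the example probabilities are hypothetical, and L = 3.841 is assumed as the 0.05 cutpoint.

```python
from math import comb, factorial, sqrt


def tdt_power_binomial(h, p_t, cutpoint):
    """Two-tailed binomial estimate of TDT power with n_tdt = 2h transmissions."""
    n = 2 * h
    limit = n / 2 + sqrt(n * cutpoint) / 2
    power = 0.0
    for n_a in range(n + 1):
        n_b = n - n_a
        if n_a > limit or n_b > limit:
            power += comb(n, n_a) * p_t**n_a * (1 - p_t) ** n_b
    return power


def tdt_power_exact(h, p2a, p2b, cutpoint):
    """Exact TDT power from the trinomial distribution of (i, j)."""
    limit = sqrt(h * cutpoint / 2)
    power = 0.0
    for i in range(h + 1):
        for j in range(h - i + 1):
            if abs(i - j) > limit:
                coef = factorial(h) // (factorial(i) * factorial(j) * factorial(h - i - j))
                power += coef * p2a**i * p2b**j * (1 - p2a - p2b) ** (h - i - j)
    return power


if __name__ == "__main__":
    p2a, p2b = 0.35, 0.20                      # illustrative values only
    p_t = 0.5 + 0.5 * (p2a - p2b)
    print(tdt_power_binomial(100, p_t, 3.841), tdt_power_exact(100, p2a, p2b, 3.841))
```

One case in which the two calculations must coincide is $P_{2A} = P_t^2$ and $P_{2B} = (1 - P_t)^2$, since the two transmissions from each parent are then independent and $n_a$ is exactly binomial; this gives a convenient check on an implementation.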
APPENDIX III 

Influence of disequilibrium on $P_t$ and $P_s$ 

To demonstrate how disequilibrium affects $P_t$ and $P_s$, I assume all parameters describing a 
bi-allelic marker and linked, bi-allelic disease locus are fixed, except for the degree of disequilibrium 
(δ) between the two loci. Hence, all parameters are constant in $P_s$ (equation (1)) and $P_t$ (equation 
(2)) except for the haplotype frequencies ($c_1$, $c_2$, $c_3$, $c_4$) in the middle factor ($M_s$ or $M_t$) of each 
equation [$c_1 = mp + \delta$, $c_2 = m(1-p) - \delta$, $c_3 = (1-m)p - \delta$, $c_4 = (1-m)(1-p) + \delta$]. Therefore, by 
rewriting $M_s$ and $M_t$ in terms of δ, m and p, and then determining the δ value that maximizes $M_s$ 
(or $|M_t|$), we also determine the δ that maximizes ($P_s$ − 0.5) (or $|P_t - 0.5|$). 

I focus first on $M_s$, the middle factor in $P_s$. By definition, $M_s = (c_1 c_4 + c_2 c_3)/H$ where H = 
$2(c_1 c_4 + c_2 c_3)W_{Dd} + 2c_1 c_3 W_{DD} + 2c_2 c_4 W_{dd}$, and $W_{Dd}$, $W_{DD}$ and $W_{dd}$ are functions of α, β and γ, and 
hence are constant (see Appendix I for full expression for H). The δ values that minimize and 
maximize $M_s$ are found by solving the equation $dM_s/d\delta = 0$ for δ. There are two solutions (roots), 
but only one root ($\delta_0$) falls within the interval of genetically possible values of δ bounded by $\delta_{min}$ 
and $\delta_{max}$: 

$$\delta_0 = \frac{2m(1-m)\,Y - \sqrt{m(1-m)}\,\sqrt{W_{DD}\,W_{dd}\,(4m(1-m)-1) + Y^2}}{(2m-1)(W_{DD}-W_{dd})}$$

where $Y = [p(W_{DD} - W_{dd}) + W_{dd}]$. 

Since it can be shown that $d^2M_s/d\delta^2 > 0$ when $\delta = \delta_0$, it follows that $\delta = \delta_0$ minimizes $M_s$ and 
$P_s$. Because there are no other maxima or minima for $M_s$ between $\delta_{min}$ and $\delta_{max}$, the δ value that 
maximizes $M_s$ and $P_s$ must be one of the two endpoints of the interval ($\delta_{min}$ or $\delta_{max}$). This is true 
even when $\delta = \delta_0$ happens not to fall within the interval, since $dM_s/d\delta$ would then be positive or 
negative throughout the interval, implying that $M_s$ is maximized at one endpoint and minimized 
at the other. 

I now turn to $M_t$, and show that $|P_t - 0.5|$ is always maximized at the same value ($\delta_{min}$ or $\delta_{max}$) 
that maximizes ($P_s$ − 0.5). By definition, $M_t = (c_1 c_4 - c_2 c_3)/H = \delta/H$ where H is as defined above. 
Since H > 0, $M_t = \delta/H$ may be positive or negative. Therefore, when all parameters in $P_t$ are fixed 
except δ, the maximum value of the function $|M_t(\delta)|$ also maximizes $|P_t - 0.5|$. In this connection, 
note that $M_t = (c_1 c_4 - c_2 c_3)/H$ is identical to $M_s = (c_1 c_4 + c_2 c_3)/H$ except that in the numerator of $M_s$, $c_1 c_4$ 
and $c_2 c_3$ are added rather than subtracted. Since H is positive and the haplotype frequencies $c_1$, $c_2$, $c_3$ and $c_4$ 
are never negative, $M_s \ge |M_t|$ at any value of δ. However, it is well known that $\delta_{max} = m(1-p)$ if 
$p \ge m$ or $\delta_{max} = p(1-m)$ if $p < m$, and that $\delta_{min} = -mp$ if $p \le (1-m)$ or $\delta_{min} = -(1-m)(1-p)$ 
if $p > (1-m)$ [Ott, 1991]; that is, each bound is the value of δ at which the first haplotype frequency 
reaches zero. Since $c_1 = mp + \delta$, $c_2 = m(1-p) - \delta$, $c_3 = (1-m)p - \delta$ and $c_4 = 
(1-m)(1-p) + \delta$, the expressions for $\delta_{max}$ and $\delta_{min}$ imply that $c_1 c_4 = 0$ or $c_2 c_3 = 0$ at both $\delta = \delta_{max}$ 
and $\delta = \delta_{min}$. Hence, $M_s = |M_t|$ at $\delta = \delta_{max}$ and at $\delta = \delta_{min}$. Therefore, the same value ($\delta_{max}$ or $\delta_{min}$) 
that maximizes $M_s$ must also maximize $|M_t|$ and $|P_t - 0.5|$. 
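The behaviour of $M_s$ and $M_t$ over the admissible range of δ can be explored with a brief Python sketch. It is a hedged illustration: the function names are hypothetical, the values chosen for $W_{Dd}$, $W_{DD}$ and $W_{dd}$ are arbitrary positive constants rather than values computed from penetrances, and the bounds are taken as the values of δ at which a haplotype frequency first reaches zero. The closed form for $\delta_0$ is undefined when m = 0.5 or $W_{DD} = W_{dd}$, so the sketch avoids those cases.

```python
from math import sqrt


def delta_bounds(m, p):
    """Smallest and largest delta that keep all four haplotype frequencies >= 0."""
    return -min(m * p, (1 - m) * (1 - p)), min(m * (1 - p), p * (1 - m))


def middle_factors(m, p, delta, w_hd, w_dd_upper, w_dd_lower):
    """Return (M_s, M_t); w_hd, w_dd_upper, w_dd_lower stand for W_Dd, W_DD, W_dd."""
    c1 = m * p + delta
    c2 = m * (1 - p) - delta
    c3 = (1 - m) * p - delta
    c4 = (1 - m) * (1 - p) + delta
    h = 2 * (c1 * c4 + c2 * c3) * w_hd + 2 * c1 * c3 * w_dd_upper + 2 * c2 * c4 * w_dd_lower
    return (c1 * c4 + c2 * c3) / h, (c1 * c4 - c2 * c3) / h


def delta_zero(m, p, w_dd_upper, w_dd_lower):
    """Stationary point of M_s (requires m != 0.5 and W_DD != W_dd)."""
    y = p * (w_dd_upper - w_dd_lower) + w_dd_lower
    a = m * (1 - m)
    root = sqrt(w_dd_upper * w_dd_lower * (4 * a - 1) + y * y)
    return (2 * a * y - sqrt(a) * root) / ((2 * m - 1) * (w_dd_upper - w_dd_lower))


if __name__ == "__main__":
    m, p = 0.3, 0.2
    w = (0.25, 0.5, 0.1)  # arbitrary illustrative W_Dd, W_DD, W_dd
    d_min, d_max = delta_bounds(m, p)
    d0 = delta_zero(m, p, w[1], w[2])
    for d in (d_min, d0, d_max):
        print(d, middle_factors(m, p, d, *w))
```

For these illustrative inputs the stationary point lies inside the interval and $M_s$ is smaller there than at either endpoint, while $M_s$ and $|M_t|$ agree at both endpoints.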

APPENDIX IV 

Derivation of expression for $P_s$ for a fully informative marker 

To derive $P_s$ for a fully informative or completely polymorphic marker, suppose that a bi-allelic 
marker is perfectly associated with a linked (θ < 0.5), bi-allelic disease locus, which implies that $c_2$ 
= $c_3$ = 0, $c_1 = p$, $c_4 = 1 - p$ and hence, from equation (1), $P_s = 0.5 + (1-2\theta)^2[p(1-p)/H]R_s$. As 
explained in the Discussion ('Proportion of A/B parents in ascertained families'), this value of $P_s$ 
applies only to the proportion (H/F) of ascertained parents who are informative (A/B) at the 
bi-allelic marker. But, in this case, the marker and disease locus are perfectly associated, and hence 
these ascertained A/B parents must also be informative (D/d) at the disease locus. By contrast, 
because the marker and disease locus are perfectly associated, the remaining proportion (1 − H/F) 
of ascertained parents must be homozygous (D/D or d/d) at the disease locus as well as homozygous 
at the bi-allelic marker; hence these doubly homozygous parents would transmit each marker allele 
(ibd) to affected offspring with equal probability. Therefore, $P_s$ would equal 0.5 for these doubly 
homozygous parents if they were made informative by testing at a linked marker that is completely 
polymorphic. Hence, if all parents were tested at the completely polymorphic marker, the value of 
$P_s$ for the completely polymorphic marker would equal: 

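$$\left(1-\frac{H}{F}\right)(0.5) + \frac{H}{F}\left\{0.5 + (1-2\theta)^2\left[\frac{p(1-p)}{H}\right]R_s\right\} = 0.5 + (1-2\theta)^2\left[\frac{p(1-p)}{F}\right]R_s.$$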

As noted in Results (see 'Power of $\chi^2_{asp}$ and $\chi^2_{tdt}$'), this expression for $P_s$ is identical to that obtained 
for a bi-allelic marker in equilibrium with a bi-allelic disease locus. 
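The equivalence with the equilibrium case can be checked numerically with the sketch below. It rests on an assumption that is not spelled out in this appendix: F is taken to be $p^2 W_{DD} + 2p(1-p)W_{Dd} + (1-p)^2 W_{dd}$, the weighted frequency of all ascertained parents regardless of marker genotype. The function names and numerical inputs are likewise illustrative, and the factor $(1-2\theta)^2 R_s$ is represented by a single arbitrary constant.

```python
def f_all_parents(p, w_hd, w_dd_upper, w_dd_lower):
    """Assumed form of F, with w_hd, w_dd_upper, w_dd_lower for W_Dd, W_DD, W_dd."""
    return p * p * w_dd_upper + 2 * p * (1 - p) * w_hd + (1 - p) ** 2 * w_dd_lower


def ms_equilibrium(m, p, w_hd, w_dd_upper, w_dd_lower):
    """M_s = (c1*c4 + c2*c3)/H for a bi-allelic marker with delta = 0."""
    c1, c2, c3, c4 = m * p, m * (1 - p), (1 - m) * p, (1 - m) * (1 - p)
    h = 2 * (c1 * c4 + c2 * c3) * w_hd + 2 * c1 * c3 * w_dd_upper + 2 * c2 * c4 * w_dd_lower
    return (c1 * c4 + c2 * c3) / h


def ps_fully_informative(p, k, w_hd, w_dd_upper, w_dd_lower):
    """Mixture over informative and doubly homozygous parents; k = (1 - 2*theta)^2 * R_s."""
    f = f_all_parents(p, w_hd, w_dd_upper, w_dd_lower)
    h = 2 * p * (1 - p) * w_hd  # perfect association: c1 = p, c4 = 1 - p, c2 = c3 = 0
    return (1 - h / f) * 0.5 + (h / f) * (0.5 + k * p * (1 - p) / h)


if __name__ == "__main__":
    p, k = 0.2, 0.3
    w = (0.25, 0.5, 0.1)  # arbitrary illustrative W_Dd, W_DD, W_dd
    print(ps_fully_informative(p, k, *w), 0.5 + k * ms_equilibrium(0.3, p, *w))
```

If the assumed form of F is appropriate, the two printed values should agree for any marker allele frequency m, because the factor 2m(1 − m) cancels between the numerator and H in the equilibrium case.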



